1.0 INTRODUCTION*


With some 200 commercial crops under cultivation, California's
agriculture is one of the most diversified in the world. The State
leads the nation in the production of 47 commercial crop and livestock
commodities and is one of the top five producers of an additional 19.
Gross cash receipts from farm marketings in 1978 totaled $10.4 billion.
With this income, California continues as the leading farm state, with
nearly 10 percent of the nation's cash receipts.

This abundance of agricultural output results from the cultivation
of approximately 13.3 million hectares (32.8 million A, 1978). The
combined acreage of principal crops in 1978 totaled 3.8 million hectares
(9.4 million A), a 5 percent increase in harvested area since 1977.

Field crops (2.7 million hectares, 6.7 million A), fruit and nut crops
(0.7 million hectares, 1.7 million A) and vegetable and melon crops (0.4
million hectares, 0.9 million A) yielded 43.1 million tons of harvested
farm products.¹ Much of the success of this agricultural production is
founded on the availability of water for irrigation. The California
Department of Water Resources (DWR) estimates that approximately 3.8
million hectares (9.5 million A) are irrigated at least once during the
growing season. This water is derived from surface sources, groundwater
extraction and the construction of large-scale water transport
projects. Agriculture is the prime recipient of the available water,
utilizing about 85% of the supply.

In 1957, California Water Code Section 10005 established the
California Water Plan. It is a "comprehensive master plan to guide
and coordinate the planning and construction of works required for the
control, protection, conservation and distribution of the water of
California to meet present and future needs for all beneficial uses
and purposes in all areas of the State".² The responsibility for
updating and supplementing the Plan was assigned to the Department of
Water Resources.

"The Department carries out this responsibility through a statewide 
planning program, which guides the selection of the most favorable pattern 
for the use of the State's water resources, considering all reasonable 
alternative courses of action. Such alternatives are evaluated on the 


* All principal measurements and calculations were performed using
customary units.

¹ Department of Food and Agriculture, State of California, "California
Principal Crop and Livestock Commodities - 1978."

² Department of Water Resources, State of California, "The California
Water Plan Outlook in 1974," Bulletin No. 160-74, November 1974.


basis of technical feasibility and economic, social, and institutional 
factors. The program comprises: 

. Periodic reassessment of existing and future demands for 
water for all uses in the hydrologic study areas of 
California. 

. Periodic reassessment of local water resources, water uses, 
and the magnitude and timing of the need for additional water 
supplies that cannot be supplied locally. 

. Appraisal of various alternative sources of ground water, 
surface water, reclaimed waste water, desalting, geothermal 
resources, etc. - to meet future demands in the areas of 
water deficiency. 

. Determination of the need for protection and preservation of 
water in keeping with protection and enhancement of the 
environment. 

. Evaluation of water development plans."

A summary status of conditions and expectations is published every four 
years in the form of a comprehensive bulletin (Bulletin 160) that is used 
to provide information to aid in guiding and coordinating the use of 
California's water resources. 

To meet these responsibilities, DWR has long recognized the need for
specific land use data as an input to state water planning. Since the late
1940s the Department has been performing a continuing survey to monitor
land use changes over the state. Because of manpower and budgetary
constraints, only a portion of the state (approximately one-seventh) is
surveyed during any given year. DWR's surveys produce two types of output:
(1) land use surveys, which record the nature and extent of present
water-related land development, and (2) land classification surveys,
designed to determine the location and extent of lands with physical
characteristics suited to specific kinds of development. The survey more
pertinent to the projects discussed in this report is the land use
survey. It is compiled through the interpretation of current 35 mm aerial
photography supplemented with field inspections. Tabulations of the acreage
of each specific land use class are then summarized by 7-1/2 minute quad
sheet, county and other area subdivisions such as water agency or hydrographic
area. Figures 1-1 and 1-2 show the land use legend and a completed land use
map prepared by DWR.

As seen in Figures 1-1 and 1-2, each parcel of agricultural land has
been designated as either irrigated, the prefix "i," or non-irrigated, "n".
This condition is determined by the interpretation of aerial photography
and the gathering of supplementary field data as mentioned above. From the
data collected, DWR is able to generate maps showing the land use
classification to cover type, including crop identification, and the
acreage of irrigated lands. Since each land use is associated with a
specific water demand, total water consumption forecasts can then be made.
Due to the limitations of the one-date survey, however, the DWR survey is
not considered accurate as to the proportion of acreage devoted to small
grains or multiple cropping.
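
The forecasting step described above amounts to multiplying the acreage of
each land use class by its unit water demand and summing. The following is
a minimal sketch of that arithmetic, not DWR's actual procedure. The unit
demand figures, the acreages and the function name are hypothetical and
are chosen only for illustration.

```python
# Hypothetical unit water demands (acre-feet per acre per year).
# These values are illustrative assumptions, not DWR figures.
UNIT_DEMAND_AF_PER_ACRE = {
    "rice": 6.0,
    "cotton": 3.0,
    "alfalfa": 4.0,
}


def total_water_demand(acreage_by_use, unit_demand):
    """Sum acreage x unit demand over land use classes (acre-feet/year)."""
    missing = set(acreage_by_use) - set(unit_demand)
    if missing:
        raise KeyError(f"no unit demand for: {sorted(missing)}")
    return sum(acres * unit_demand[use] for use, acres in acreage_by_use.items())


# Hypothetical irrigated acreages as they might be tabulated from a survey.
surveyed_acres = {"rice": 1000.0, "cotton": 2000.0, "alfalfa": 500.0}
demand = total_water_demand(surveyed_acres, UNIT_DEMAND_AF_PER_ACRE)
print(f"forecast demand: {demand:,.0f} acre-feet/year")
```

Because the forecast is linear in acreage, any error in the acreage of a
high-demand class carries directly into the total, which is why the
small-grain and multiple-cropping limitation noted above matters.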

California receives an annual average of 200 million acre-feet of
precipitation. Most of the runoff, approximately 70,000,000 acre-feet,
occurs in areas with the lowest population densities. As a result,
large-scale water systems, both state and federal, have been constructed to
store and transport water from the areas of accumulation to the areas of
demand. In recent years California, like much of the West, had been
experiencing a major drought. In normal years approximately 85% of the
total water used would be consumed by agriculture. In 1977, state officials
were expecting a 10 million acre-foot deficit. Because of this, state
and federal water managers initiated stringent reductions in water deliveries.
The Bureau of Reclamation reduced its deliveries to less than half of
its usual Central Valley Project (Federal jurisdiction) allotments.
Likewise, the State Water Project (State jurisdiction) was forced to curtail
water deliveries to its 24 contracting districts to 1.8 million acre-feet,
down 1.6 million acre-feet from projected demand. Since California alone
supplies approximately 40% of the nation's summer fruit and vegetable crops,
the loss in production caused by the drought impacts agribusiness and price
structures nationwide.

The drought dramatically emphasized the need for accurate and
timely information on the extent of irrigation and the nature of
agriculture as input for water management decisions. In addition to its
normal survey techniques, DWR has been actively participating since 1975
with NASA and the University of California on several projects designed
to investigate the feasibility of estimating irrigated acreage and
determining cropping practices within the state utilizing a Landsat-based
remote sensing system. Based on the results of these studies, information
acquired from the analysis of satellite imagery may become a valuable
supplement to the land use information presently collected by DWR. The
use of the satellite system allows DWR the opportunity to analyze data
from several dates during the growing season and the ability to collect
data over the entire state in one year.

SUMMARY OF RESULTS: 15 APRIL 1975 - 31 DECEMBER 1978



Since 1975, the California Department of Water Resources has been
cooperating with NASA and the University of California on a series of
projects designed to address the applicability of satellite data as
input to water management decisions. Research and demonstration have
been supported by: 1) NASA Contract NAS 5-20969 (Goddard Space Flight
Center) and 2) NASA Grant NSG 2207 (Ames Research Center). Although
the majority of the work described in this final report was supported
by Ames Research Center, the results are based on preliminary work
developed under contract with Goddard Space Flight Center. Because each
succeeding year's work has been based on accomplishments of the previous
years, a description of the methodologies, rationales and results of the
initial contract work is included. Following that, summaries of the
first two years of this grant are given. The remainder of the final
report details the results of this year's (1979) effort. Further work
on the project is continuing under NASA Cooperative Agreement NCC 2-54
(Ames Research Center). Figure 2-1 diagrams the stepwise support of the
project.


[Figure 2-1: timeline chart, January 1975 through December 1980, showing
the periods covered by NASA Contract NAS 5-20969*, NASA Grant NSG 2207**
and NASA Cooperative Agreement NCC 2-54***]

* DWR, NASA/Goddard, University of California (Berkeley)

** DWR, NASA/Ames, University of California (Berkeley and Santa Barbara)

*** DWR, NASA/Ames, University of California (Berkeley and Santa Barbara)


Figure 2-1. Work on the estimation of irrigated land in California has been
supported by NASA since 1975. This figure diagrams the timeframes of the
three funding vehicles. NASA Grant NSG 2207 provided the support for the
work covered by this final report.

2.1 AN INVENTORY OF IRRIGATED LANDS FOR SELECTED COUNTIES WITHIN
THE STATE OF CALIFORNIA BASED ON LANDSAT AND SUPPORTING
AIRCRAFT DATA (15 April 1975 - 15 January 1977)


The first cooperative effort between NASA/GSFC, DWR and the Remote
Sensing Research Program of U.C. Berkeley began on 15 April 1975
and continued until 15 January 1977. The three main objectives of the
study were: (1) to develop a process for providing irrigated acreage on
a regional basis using Landsat; (2) to develop a technique that would
provide this estimate in one year; and (3) to achieve a level of
precision for the State to within ±3% at the 99% level of confidence.

Selected in conjunction with DWR, ten counties representing much of
the agricultural diversity found in California were defined as the study
area. Seven of the ten counties were located in the Central Valley;
the others were located in coastal and mountain areas. After exclusion
areas had been removed, the total population subject to sampling and
interpretation was approximately 1,500,000 hectares (3,707,000 A).
Exclusion areas were defined as areas not subject to irrigation (urban,
wildland, wildlife refuges) and areas where information on irrigation was
so good as to make sampling unnecessary (established orchards).

A three phase sample design based on a sampling frame of area units 
with stratification by county was used. The three phase design was selected 
to maximize the advantages of spectral reflectance and field pattern 
(auxiliary variable data) available on Landsat and aerial photography as 
they relate to irrigated acreage. Multiple dates of Landsat were used as 
Phase I to provide relatively inexpensive, county-wide estimates of irri- 
gated proportion. Multitemporal vertical color aerial photography, used 
as Phase II, provided a cost-effective means to correct the Landsat esti- 
mates for bias. Finally measurements made on a small sample of Phase III 
ground units were used, in turn, to calibrate the aerial photography 
estimates and provide the most accurate information on crop type and 
irrigation. 

A rectangular sample grid of 1.6 by 8.0 kilometer (1 x 5 mile) sample
units was defined to cover each county. Since no prior data on irrigated
acreage variance versus sample unit dimension were available, sample unit
size and shape were chosen based on practical considerations. These
considerations dealt with ease of data acquisition and measurement at each
sample stage.

In order to determine the Phase I, II and III sample sizes by county
(stratum) that would be expected to support the statewide irrigated
acreage precision goal of ±3% at the 99% level of confidence, a
preliminary population model was constructed. Sample size (number of
sample units) allocations were based on previously published estimates of
the proportion of area irrigated by county, approximate between-phase
cost ratios and a non-linear programming algorithm which minimizes cost
subject to constraints on variance. Samples were allocated with equal
probability at each sample phase within each county. Sample units
eligible for selection were confined to those selected for measurement at
the previous phase. For the entire ten-county test site, 1292 Phase I,
90 Phase II and 18 Phase III units were selected.
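
The nesting rule described above, in which each phase is drawn with equal
probability from the units measured at the previous phase, can be sketched
as follows. This is an illustration only: the county size, the per-county
sample sizes, the seed and the function name are hypothetical, and the
cost-minimizing allocation step is not reproduced here.

```python
import random


def select_nested_phases(unit_ids, n_phase1, n_phase2, n_phase3, seed=0):
    """Draw three nested equal-probability samples without replacement.

    Phase II units are drawn only from Phase I units, and Phase III
    units only from Phase II units.
    """
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    phase1 = rng.sample(list(unit_ids), n_phase1)
    phase2 = rng.sample(phase1, n_phase2)
    phase3 = rng.sample(phase2, n_phase3)
    return phase1, phase2, phase3


# Hypothetical county covered by 200 grid sample units.
p1, p2, p3 = select_nested_phases(range(200), n_phase1=130, n_phase2=9, n_phase3=2)
print(len(p1), len(p2), len(p3))
```

In a stratified design this selection would be repeated independently for
each county, with the sample sizes coming from the allocation model.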

A three-phase regression estimation system was chosen to provide
irrigated acreage proportion estimates. This model was thought to
represent the between-phase proportion relationships most accurately. The
mean and variance estimators followed the treatment given by Tikkiwal
(1955 and 1967). Basically, the estimators were iterative, such that the
Phase III (ground) estimator used the Phase II (aerial photo) estimator,
which in turn used the Phase I (Landsat) estimator.
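
The chained structure of such an estimator can be illustrated with a
simplified sketch. It applies the ordinary two-phase regression estimator
twice: the Landsat mean over all Phase I units corrects the photo mean,
and the corrected photo mean then corrects the ground mean. This is an
assumption-laden illustration, not the Tikkiwal estimators or the MPHASE
code; the function names and data are hypothetical and the variance
estimator is omitted.

```python
from statistics import mean


def slope(x, y):
    """Ordinary least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def regression_estimate(y_small, x_small, x_large_mean):
    """Two-phase regression estimator of the mean of y.

    The small-sample mean of y is adjusted by the difference between the
    better-known mean of the auxiliary variable x and its small-sample mean.
    """
    b = slope(x_small, y_small)
    return mean(y_small) + b * (x_large_mean - mean(x_small))


def three_phase_estimate(landsat_p1, landsat_p2, photo_p2, photo_p3, ground_p3):
    """Chain two regression estimators: Landsat -> photo -> ground."""
    photo_mean = regression_estimate(photo_p2, landsat_p2, mean(landsat_p1))
    return regression_estimate(ground_p3, photo_p3, photo_mean)


# Hypothetical irrigated proportions per sample unit.
landsat_p1 = [0.2, 0.4, 0.6, 0.8, 1.0, 0.5]   # all Phase I units
landsat_p2 = [0.2, 0.6, 1.0]                  # Landsat values on Phase II units
photo_p2 = [0.25, 0.65, 1.05]                 # photo values on Phase II units
photo_p3 = [0.25, 1.05]                       # photo values on Phase III units
ground_p3 = [0.225, 0.945]                    # ground values on Phase III units

estimate = three_phase_estimate(landsat_p1, landsat_p2, photo_p2, photo_p3, ground_p3)
print(f"estimated proportion irrigated: {estimate:.4f}")
```

The design choice is that the expensive measurement (ground) is taken on
very few units, while the cheap one (Landsat) supplies the population-wide
mean against which the expensive measurements are calibrated.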

The multitemporal capabilities available with Landsat offer obvious
benefits for monitoring an agricultural growing season. Three time
periods were selected for analysis: (1) June - to monitor small grains
and establish a base for multiple cropping, (2) August - to provide data
on maximum canopy coverage expected for many irrigated crops, and (3)
September - to continue multiple cropping observations. Interpretation
for Phase I was done on multidate Landsat mosaics of each county that
had been enlarged to 1:154,000. The August imagery acted as the base
date for two major reasons. First, it is the height of the growing season,
when maximum vegetation cover is present. Second, in nearly all of
the agricultural areas of California, if a crop is growing in August it can
be safely assumed to be irrigated. The May and October date imagery is
used to control early harvested crops and multiple cropped areas.

The multitemporal large scale color aerial photography used as Phase
II was procured using a Twin Comanche aircraft, equipped with a vertical
closed circuit TV system for location and a Nikon 35mm camera for
photography. After enlargement to the standard 3R size (scale approximately
1:21,000) the photography was mosaicked into strips that covered each sample
unit. Each sample unit was then interpreted to obtain an estimate of the
irrigated area within it. Multitemporal ground data (Phase III) were also
collected for a subset of the sample units flown with aerial photography.

The results of the interpretation and ground data collection were
tabulated and input to a Fortran program, MPHASE, written at Berkeley and
designed to calculate the multiphase estimate, variance, standard error,
relative standard error and sample correlation coefficients for each
county. The result was a regional estimate, summarized by county, in
which 80.17% (1,202,401 ha*, 2,971,827 A) of the population was estimated
to be irrigated. The confidence interval of the estimate is shown below.
Since the population sampled in this study represented less than half
the agricultural land in California, a sample of the larger area would
be expected to produce precision performance approaching the ±3% at the
99% level requested by DWR for statewide reporting.


Table 2-1. Confidence interval of the estimate of irrigated land
in ten counties in California.


Confidence interval of the estimate
(half width expressed as percent)

1 - a = .68        1 - a = .95        1 - a = .99

t = 1.00           t = 1.98           t = 2.358

± 2.73             ± 5.41             ± 6.44
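
The entries in Table 2-1 are consistent with each half width being the
relative standard error (2.73%) multiplied by the tabulated t value. The
short check below reproduces the table under that assumption. The function
name is ours, and the relationship is inferred from the table values
rather than quoted from the MPHASE documentation.

```python
def half_width(relative_standard_error_pct, t_value):
    """Half width of the confidence interval, in percent of the estimate."""
    return relative_standard_error_pct * t_value


# t values as tabulated in Table 2-1, keyed by confidence level (1 - a).
T_VALUES = {0.68: 1.00, 0.95: 1.98, 0.99: 2.358}
RELATIVE_STANDARD_ERROR_PCT = 2.73

half_widths = {
    level: round(half_width(RELATIVE_STANDARD_ERROR_PCT, t), 2)
    for level, t in T_VALUES.items()
}
for level, width in half_widths.items():
    print(f"1 - a = {level:.2f}: +/- {width:.2f}%")
```

At the 99% level the half width of 6.44% is roughly twice the ±3% goal,
which is the gap the surrounding text expects a larger sampled area to
close.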


Evaluation by a University of California resource economist found
that the costs of the inventory compared favorably with a hypothetical
DWR-style survey of irrigated acreage only (approximately 3¢/hectare,
1.2¢/acre). He further found that the results approximated comparable
estimates produced by the Ag Census and county agricultural commissioners.
In addition, it was feasible to complete the project and generate the
statistics within the DWR time requests.

With encouraging results from this first effort, a second project was
undertaken. This cooperative study by NASA/ARC, DWR and the University
of California (Berkeley and Santa Barbara campuses) was to continue the
development of techniques to optimize the estimation of irrigated acreage
and to test them on a large yardstick region. Additionally, work was
initiated on the use of computer assisted analysis techniques for
estimating irrigated acreage. Work on manual and computer assisted
analysis techniques for determining specific crop types was also begun.

2.2 DETERMINING THE USEFULNESS OF REMOTE SENSING FOR ESTIMATING
WATER DEMAND IN CALIFORNIA (1 January 1977 - 28 February 1978)

Two main test sites were selected for study: (1) the Sacramento
Valley (1,977,000 hectares, 4,885,000 A) in northern California, and (2)
Kern County (404,700 hectares, 1,000,000 A) in the southern San Joaquin
Valley. On both sites DWR had performed 100% land use surveys.

In the Sacramento Valley Test Site the proportion irrigated was
estimated for the entire fourteen-county region using manual analysis
techniques developed in the previous project. To this end, a stratification
recommended at the end of the Irrigated Lands Project was developed and
produced for the region.

The main purposes of the stratification were to allocate sample units
more optimally and to control measurement error. The stratification
developed was based on agricultural practices, environmental conditions and
field size, the major factors affecting manual analysis performance. Six
agricultural practice strata were identified. These strata were composed of
areas that, on Landsat 1:1,000,000 color composite transparencies, appear
to be:


Stratum Number     Stratum Description

      1            Generally dry farmed

      2            Field crops - fields generally less than
                   16 hectares (40 A)

      3            Field crops - fields generally from 16-32
                   hectares (40 - 79 A)

      4            Field crops - fields generally 33 hectares
                   (80 A) or greater

      5            Orchards & vineyards - fields generally less
                   than 16 hectares (40 A)

      6            Orchards & vineyards - fields generally
                   16 hectares (40 A) or greater


The multiphase sampling design was maintained, although some refinement
was necessary: sample allocation was controlled by the strata, and a
two-phase rather than a three-phase sample design was used. Since the
project was begun after the 1976 growing season, multitemporal aerial
photography and ground data were not available. The photo and ground
measurements were considered as a single phase (Phase II). The goal of ±3%
at the 99% level of confidence for statewide estimation was continued.
Based on this, 1330 Phase I and 141 Phase II units were allocated.


Multitemporal Landsat from 30 May, 28 August and 3 October was used,
with interpretation and tabulation as before. The results of this study
showed 54.24% (1,072,277 ha, 2,649,561 A) of the population estimated to be
irrigated. A decrease in the relative error from ±2.73% to ±1.52% was
achieved. The resulting confidence intervals are shown below:


Table 2-2. Confidence interval of the estimate of irrigated land in the
fourteen-county Sacramento Valley Test Site.


Confidence interval of the estimate
(half width expressed as percent)

1 - a = .68        1 - a = .95        1 - a = .99

t = 1.00           t = 1.992          t = 2.631




In addition, a single county comparison was completed in which 21
individual 7-1/2 minute quadrangles were compared. Based on this comparison
the Landsat measurement came within 4,047 hectares (10,000 A) of the 117,289
hectares (289,816 A) tabulated by DWR. Interpretation performed on the same
area by DWR personnel familiar with the county came within 2,428 hectares
(6,000 A) of the tabulated total.

In selected areas within the Sacramento Valley Site, computer assisted 
analysis techniques were tested for the estimation of irrigated land. Using 
multitemporal digital tapes, unsupervised and maximum likelihood classifi- 
cation techniques were used to estimate proportion irrigated on a single 
7-1/2 minute quadrangle. Based on this classification, 67.52% of the area 
was estimated to be irrigated compared to 64.54% irrigated tabulated from 
the DWR land use survey.

In Kern County, a study to test the ability to map irrigated land using
manual analysis techniques and multitemporal Landsat was also done. Work
based on earlier studies with the Kern County Water Agency indicated that
95% of the area was correctly mapped as irrigated.

The second phase of this project was devoted to specific crop type
estimation and mapping. Certain basic data were used in the work done on
the various test sites, including: ancillary data such as historical
crop acreages, trends and crop calendars; 100% aerial photography and ground
data from DWR; regional multidate crop keys (Landsat); regional crop
determination matrices; and multitemporal Landsat data in a variety of
forms. A major task within this phase was the estimation of small grains
within the entire Sacramento Valley Test Site and selected sites within
Kern County. Using manual analysis, multiphase sampling, stratification
and multitemporal Landsat (19 March, 30 May, 26 June), techniques were
developed to estimate the proportion of small grains within
the Sacramento Site. The regional estimate was again summarized by county,
with 13.35% (264,005 ha, 652,348 A) estimated to be small grains. The
relative error of this estimate was 6.26%, or ±8.09% at the 90% level of
confidence. Within Kern County, a mapping task to detect and map small
grains was done on the Wheeler Ridge-Maricopa Test Site. Results of this
task indicated that 85% of the small grain fields were correctly mapped.

Additional crop specific estimation and mapping was done for a variety 
of crops including alfalfa, sugar beets, cotton, tomatoes, safflower, melons, 
lettuce and fallow. Using multidate Landsat, per class accuracies averaged 
71% in Kern County; when grouped in water consumptive use classes, accuracies 
increased to 84%. Safflower mapping in the Sacramento Valley Site averaged 
92% correct; acreages, however, were very low in 1976. 
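
The gain from grouping crops into water consumptive use classes can be
illustrated with a small confusion matrix: confusion between two crops in
the same group stops counting as an error once the classes are merged. The
sketch below is hypothetical throughout. The counts, the two groups "A" and
"B" and the function names are assumptions for illustration and are not
the Kern County data.

```python
def mean_per_class_accuracy(confusion):
    """Average, over true classes, of the fraction labeled correctly.

    confusion[true_class][predicted_class] = number of fields.
    """
    rates = [row.get(cls, 0) / sum(row.values()) for cls, row in confusion.items()]
    return sum(rates) / len(rates)


def group_confusion(confusion, group_of):
    """Merge rows and columns of a confusion matrix by class group."""
    grouped = {}
    for true_cls, row in confusion.items():
        g_true = group_of[true_cls]
        grouped.setdefault(g_true, {})
        for pred_cls, count in row.items():
            g_pred = group_of[pred_cls]
            grouped[g_true][g_pred] = grouped[g_true].get(g_pred, 0) + count
    return grouped


# Hypothetical field counts: rows are ground truth, columns are analyst labels.
confusion = {
    "cotton": {"cotton": 80, "tomatoes": 10, "melons": 10},
    "tomatoes": {"tomatoes": 60, "melons": 30, "cotton": 10},
    "melons": {"melons": 70, "tomatoes": 30},
}
# Hypothetical consumptive use groups.
group_of = {"cotton": "A", "tomatoes": "B", "melons": "B"}

crop_accuracy = mean_per_class_accuracy(confusion)
group_accuracy = mean_per_class_accuracy(group_confusion(confusion, group_of))
print(f"per crop: {crop_accuracy:.3f}, per use group: {group_accuracy:.3f}")
```

For water demand estimation the grouped figure is the relevant one, since
mislabeling one crop as another with similar consumptive use has little
effect on the demand forecast.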

A final task of the Agricultural Water Demand Project was to begin
defining the parameters for regionalizing the state: identifying the
varied agricultural regimes of California and determining the typical
signature of each, based on photomorphic as well as cultural
and physical factors. The final definition and production of the
regionalization were completed during the first year of the Applications
Pilot Test (APT), which began in November 1977 and ended December 1979.


2.3 IRRIGATED LANDS ASSESSMENT FOR WATER MANAGEMENT - APPLICATIONS
PILOT TEST (APT) (1 November 1977 - 31 December 1978)

The first year of the APT (November 1977 - December 1978) focused on
constructing a framework for a large scale demonstration and technology
transfer. The specific objectives of the first year were to: (1) produce
a general regionalization of the state (briefly described above), (2) extend
the stratification, described above, to the San Joaquin Valley, (3) continue
development and demonstration of techniques for DWR-defined interests in
several environmentally different areas, and (4) develop and conduct
technology transfer sessions for the DWR user group.

To monitor the diversified agriculture of California on a statewide
basis using Landsat remote sensing techniques requires the definition of
regions where this approach is applicable and where similar techniques
may be used. A variety of environments may require a variety, or at
least a set, of remote sensing techniques. The choices considered for
this project are Landsat, aerial photography and field work in various
combinations and with a variety of temporal permutations. The number of
days an area is obscured by cloud cover, the spectral properties of the
surrounding native vegetation and the size and shape of fields,
especially as affected by surrounding topography, all affect the choice
and optimization of the procedural steps. Those areas with the highest
target-to-background contrast (in the spectral, spatial or temporal
dimensions) are most amenable to satellite-based remote sensing,
while those areas with little contrast presently require a greater
dependence on higher resolution aerial photography or field work.

A second criterion for regionalization is cost, in both money and time.
For this, areas within the state where the natural environment has
historically restricted agricultural development were defined as
"minimal cropland zones." This is not to say that remote sensing has no
agricultural application in these areas, but generally speaking, the
level of agriculture in these areas is so low that the major cost would
be the locating of small agricultural areas. On the other hand, where
areas of significant size, potential or importance exist, it may prove
cost effective to periodically monitor for change detection.

A total of twelve regions of major significance to remote sensing 
assessment of croplands have been defined. Seven of those regions have 
areas of significant agricultural importance. The rest, because of climate, 
topography and lack of good soils, have much less water resource signifi- 
cance and are classified as minimal cropland zones. 

In addition to extending the stratification described earlier to the
San Joaquin Valley of California and conducting two technology transfer
sessions, a number of DWR-defined special interest tasks relating to water
use were studied. Although at this point these studies do not relate
directly to multiphase estimation of irrigated land, as the APT moves into
providing DWR with more detailed water management information, studies
such as these will be increasingly pertinent. There were four 29,948
hectare (74,000 A) test sites, one located in each of the DWR districts.

In the Northern District, DWR was specifically interested in studying
the feasibility of estimating the acreage of land under cultivation for
rice early in the growing season. Rice is a high value crop ($167,666,000
in 1977), demands a large amount of water (5-7 acre-feet/year), and
the area under production varies considerably: 124,600 - 212,500 hectares
(308,000 - 525,000 A). If actual rice acreage is less than anticipated, the
water which was planned for use early in the season (for initial flooding)
and that had been obligated to maintain the surface level throughout the
season would then be available for transport and sale elsewhere in the
state. Because of this potential water supply, an accurate, early
estimation of rice acreage would be very useful to the Department.

Two relatively unique traits governing the location and growth of rice
need to be exploited when using Landsat to make an early estimate of
acreage: (1) rice cultivation is confined to areas that are underlain
with an impervious subsoil and are generally not suitable for other
crops, and (2) the fields are flooded prior to planting and water is
visible until the canopy obscures it. Soil type is important because it
enables a stratification of the land into areas where rice production
dominates and where a "bare soil" signature very early in the season
(March to mid-April) can be labeled as rice with some degree of confidence.
Initial flooding, which is a more secure indication of rice cultivation,
takes place in late April through mid-May.

Seven single date and three multitemporal combinations of Landsat 
color composite imagery were tested. Working with enlarged imagery 
(1:150,000), 184 dots were randomly located and the analyst was required 
to label each dot as falling in a rice or non-rice field. To aid in 
identification, multitemporal full frame Landsat color composite trans- 
parencies, the general schedule of rice operations and 1978 adjusted 
crop calendar were available to the analyst. 

For purposes of statistical analysis, the area was divided into four
equal size test cells. For each date and date pair, fields interpreted
were compared to ground data for omission and commission errors. Both
sets of data on percent omission and percent commission error were
transformed using the arcsine transformation (Sokal and Rohlf 1973) and
then were analyzed using a one-way ANOVA test for significant differences
between dates and date pairs (Scheffé 1959). When considering omission
and commission errors, the best date combination for an early estimate
was 12 May - 30 May (percent omission error = 1.3%; percent commission
error = 2.2%). On this pair, the signature given by the flooded fields
was the key to accurate analyst labeling.
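
The analysis described above can be sketched in two steps: an angular
(arcsine square root) transformation of each percent error, followed by a
one-way ANOVA F statistic across dates. The sketch below uses only the
Python standard library. The error percentages for the four test cells are
hypothetical, the function names are ours, and the comparison of F against
a critical value is omitted.

```python
import math


def arcsine_transform(percent):
    """Angular transformation of a percentage: arcsin(sqrt(p)), in degrees."""
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA; groups is a list of lists of values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical percent omission errors in the four test cells, by date.
omission_pct_by_date = {
    "date A": [12.0, 15.0, 9.0, 14.0],
    "date B": [6.0, 4.0, 7.0, 5.0],
    "date pair": [1.0, 2.0, 1.5, 0.5],
}

transformed = [
    [arcsine_transform(p) for p in cells] for cells in omission_pct_by_date.values()
]
f_statistic = one_way_anova_f(transformed)
print(f"F = {f_statistic:.2f} with {len(transformed) - 1} and "
      f"{sum(len(g) for g in transformed) - len(transformed)} degrees of freedom")
```

The transformation is applied because percentages near 0 or 100 have
variances that depend on their mean, which conflicts with the equal
variance assumption of the ANOVA.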

For this test site a number of general remarks and recommendations
can be made as a result of this analysis: (1) a stratification based on
soil type and historical rice cultivation as seen on multitemporal Landsat
should significantly reduce the area of estimation and provide a
structure for development of an appropriate sampling design, (2)
year-specific adjustments of the crop calendar are necessary to ensure
selection of optimum Landsat acquisitions, (3) timely receipt of Landsat
data is crucial to making the estimate functionally useful to DWR, and (4)
there appears to be every reason to believe that digital analysis of the
computer compatible tapes could provide an accurate estimate of rice.

The Central District of DWR has the responsibility to monitor land
use and certain field activities in the delta of the Sacramento and San
Joaquin Rivers. The combination of rich soil, easily available water,
proximity to the San Francisco Bay Area market and convenient shipping
led to early development of intensive agriculture. One of the greatest
problems facing the Delta farmer is the seasonal build-up of salt in
these low-lying soils (approximately 168,000 of the area's 299,000
hectares (415,000 of 738,000 A) lie below sea level).

In order to minimize the effects of salinity or reclaim soils, excess
water must be applied to carry salts through the soil and below the root
zone. Leaching requires a sufficient quantity of water for salt removal
and the consumptive use of the crop. If leaching takes place during
the growing season, 50-100% more water than necessary to meet consumptive
use requirements must be applied.

DWR currently monitors the extent of leaching by flying the area in
light aircraft, visually identifying the fields being leached and locating
them on prepared field maps. In the year spanning October 1975 through
April 1976, DWR flew and mapped the study area eleven times. Based on
the location of cloud cover, six paired data sets (Landsat/DWR map data
pairs) were available to study the potential use of Landsat for detecting
areas of leaching within the Delta.

Landsat color composite and MSS Band 7 imagery were studied for each 
of the acquisitions selected. Since standing water normally is quite 
obvious on Landsat imagery, it had been hoped that the fields being 
leached would be apparent. Careful study of the imagery on all the 
acquisitions yielded negative results on the analyst's ability to detect and 
identify areas of leaching. There are a number of reasons that may have 
contributed to identification problems encountered in the Delta: (1) the 
peat soils of this area are normally very dark, (2) the high water table 
as well as seepage and drainage problems in this area act to keep soils 
consistently moist, (3) the leaching cycle coincides with the winter rain 
season and resulting overall soil wetness, and (4) since leaching takes 
place in the winter, there is a relatively large proportion of land lying 
fallow. Leaching is the prerogative of the individual land owner and a 
bare soil signature is not a reliable indication of past or potential 
leaching activities. 

The San Joaquin and Southern District sites were both studied for 
specific crop type determination. The San Joaquin site is dominated by 
cotton (45%), vineyards (20%) and grain and hay crops (8%). Field crops 
and orchards make up the remainder. Three dates of imagery, 16 March 
1978, 20 July 1978 and 30 September 1978, were selected for analysis. 

Results of the evaluation indicated that: (1) deciduous orchards and 
vineyards could not be consistently identified, (2) native vegetation 
and escaped cultivars represent a confusion class with small grains due 
to similar phenologies and site degradation of some crop fields by 
surface drainage and soil salinity, (3) all the major field crops (cotton, 
small grain, hay and pasture) can be distinguished easily with proper 
date selection and (4) the seasonality of the cropping patterns, the 
large size (average 77 ha, 190 A) and regularity of field boundaries 
favorably impact the use of remote sensing techniques. 

The Southern District site was located at the mouth of the Santa 
Clara River on the Oxnard Plain. Due to a combination of rich soil and mild 
coastal climate, the area produces a wide variety of crops throughout 
the year--principally citrus, truck crops and avocados. Fields remain 
fallow for only short periods with new crops generally being planted 
shortly after harvest. Year-round multicropping results in 2-3 harvests 
for many fields of truck crops. The most significant acreage was in 
tomatoes, lemons, strawberries, dry beans, flowers and nurseries, celery, 
cabbage, bell peppers and lettuce. 

While selecting a variety of dates throughout the year was the 
primary criterion for Landsat image collection, the presence of coastal 
fog limited the options. Four dates in 1978 were ordered: 24 March, 
8 May, 19 July and 29 September. Because of heavy fog on the midsummer 
date, an image from the previous year had to be substituted. 

Examination of the imagery revealed numerous interpretation problems 
arising out of crop characteristics: (1) approximately 40% of the site is 
used for growing vegetables, (2) nearly 30 different vegetable crops are 
grown in this area, (3) 80% of the vegetable and truck crop fields are 
multicropped with both double and triple cropping occurring, and (4) field 
sizes were generally small (average 4 ha, 10 A) and irregular in shape. 

Some general conclusions based on the examination are: (1) specific 
vegetable crop identification will be difficult both because of the number of 
crops and high level of multicropping, (2) vegetable class identification 
may be possible with proper date selection, (3) lemon orchards are easily 
identified, (4) strawberries, because of their stable spatial distribution 
and growth throughout most of the year, can be identified and (5) remote 
sensing can serve as a technique to monitor urban encroachment into croplands. 


2.4 SUMMARY 


By the end of 1978, a number of issues critical to DWR had been studied. 
Of primary concern was the development of a technique by which DWR could 
produce a statewide estimate of irrigated land using manual analysis of 
Landsat imagery. Based on earlier work in the ten county study area and the 
Sacramento Valley test site, a basic methodology for producing the estimate 
has been formulated. In general, the recommended procedure would include: 


. Manual analysis 
. Regionalization 
. Multitemporal Landsat 
    . May 
    . July/August 
    . September/October 
. Multiphase sampling 
. Stratification 
. Regression 


Secondary to the manual analysis of irrigated land, preliminary work on the 
use of digital analysis for estimating and mapping irrigated land was begun 
on a limited test site basis. Demonstrations of the use of manually interpreted 
multitemporal Landsat for estimating and mapping specific crop types 
had also been completed by the end of 1978 (small grains in the Sacramento 
Valley and Kern County; a variety of field and orchard crops in Kern, Tulare 
and Ventura counties; and rice in Colusa County). 


The results of the earlier projects provided the foundation upon which 
the tasks for 1979 were based. In the following sections the objectives, 
procedures and results of the work done in 1979 will be discussed. In general, 
work was divided into four major categories: (1) manual analysis of irrigated 
lands, (2) digital analysis of irrigated lands, (3) crop type analysis and (4) 
supporting sampling design. The remainder of the report will address each of 
these topics in detail. 


3.0 1979 - OBJECTIVES 


Towards the end of 1978, a number of meetings were held between 
the Department of Water Resources, NASA/Ames and the University of 
California to establish task goals and test sites for 1979. Through 
this cooperative effort, a basic structure incorporating four Landsat 
data analysis tasks and one sampling design task was constructed: 

• Specification of sampling design 

• Task I - Estimation of irrigated land using manual analysis techniques 

• Task II - Estimation/mapping of irrigated land using digital analysis 
techniques 

• Task III - Estimation/mapping of crop type using manual analysis 
techniques 

• Task IV - Estimation/mapping of crop type using digital analysis 
techniques. 

Using 1979 Landsat data, the principal objective of Task I was to 
estimate the total irrigated area of the state of California. Using 
the basic manual analysis methodology developed during the previous 
projects, this task was designed to test the operational feasibility of 
producing an accurate estimate of irrigated land over a large area 
(~40,470,000 ha [~100,000,000 A]), in one year's time (exclusive of 
planning) and at a reasonable cost. This task dominated the 1979 
effort, requiring approximately 40% of the available resources in 
addition to a majority of the sampling design output. Task II (~20% of 1979 
effort) had two major study topics: (1) to investigate potential 
procedures and associated accuracies with the registration of multitemporal 
digital Landsat data, and (2) to test various classification procedures 
for digital estimation and mapping of irrigated land. Task II used two 
major test sites, two 1° blocks in the Sacramento Valley and three 7.5' 
quadrangles in Kern County (see Figure 3-1). Two additional tasks were 
designed to study crop type identification and mapping using manual 
(Task III) and digital (Task IV) analysis. In practical operation, 
Tasks III and IV were treated together (~20% of 1979 work) with greater 
emphasis put on the digital analysis. The test sites used for crop 
type work were a 1° block in the Sacramento Valley, three 7.5' 
quadrangles in Kern County (San Joaquin Valley) and two 7.5' quadrangles in 
Ventura County (south coast) (see Figure 3-1 for location of the test 
sites). The final task for 1979 was created to outline sampling design 
questions for all the tasks and specify a system to be used for the 1979 
statewide estimation demonstration (Task I). The Task I work dominated 
the sampling design effort during the course of 1979, although 
increasing attention to the other tasks occurred at the end of the calendar 
year. In this area, as in the data analysis phases, increasing effort 
will be put on the digital analysis of irrigated land and crop type work 
as the project progresses in the coming year. 


4.0 ESTIMATION OF IRRIGATED LAND USING MANUAL ANALYSIS TECHNIQUES (TASK I) 


Successfully producing a highly accurate, repeatable estimate of 
irrigated land over a state as large as California requires the integration 
of a variety of components. Based on the experience gained in the two 
previous projects, a set of five sub-tasks was defined to guide the 
processing of the data from the initial definition of information 
requirements to production of the estimate. As shown in Figure 4-1, the analysis 
sub-tasks were organized as follows: 

. Design and sample allocation 
. Stratification and sample frame construction 
. Landsat measurement 

. Medium scale photography and ground measurement 
. Estimate summary, evaluation and report 

For presentational simplicity the remainder of the Task I description will 
generally follow the five major sub-tasks shown on the analysis flow. 


4.1 DESIGN AND SAMPLE ALLOCATION 

Specifying the inventory design required addressing several key issues: 
(1) defining the information required by the California Department of Water 
Resources; (2) generating a data set to be used as a preliminary population 
model to test and refine the previously used estimation system; (3) applying 
statistical techniques (Monte Carlo) to the data set to simulate model 
performance, with the simulation testing various mathematical models, 
evaluating the stratification scheme and determining expected sample sizes for 
hydrologic basins; (4) specifying the mathematical model, stratification 
procedures and sample frame for the 1979 inventory; and (5) computing the 
actual sample allocation. 


4.1.1 Definition of Information Requirements 

A necessity in any project is to strictly and accurately define 
information requirements. This procedure demands frank appraisal by the user 
agency as to what is really needed and a straightforward explanation of 
what can be expected from a particular remote sensing system. Certain 
fundamental questions designed to carefully define DWR's information needs were 
posed. These questions, and the responses provided by DWR, formed the base 
upon which the Task I design was built. Table 4-1 briefly summarizes those 
questions and responses. 

[Figure 4-1. Task I: analysis flow (panel shown: Medium scale photography 
and ground measurement).] 

Table 4-1. Design requirements for the statewide estimation of irrigated 
land. 

Question                   Response 
-------------------------  ------------------------------------------------ 
Type of information?       Estimation of the proportion of irrigated land 

Areas of summary?          Hydrologic Basin (10); County (58); State 

Time?                      Inventory data summary within one year, 
                           exclusive of planning phase 

Accuracy?                  Estimate precision control at hydrologic basin 
                           level; true value of proportion irrigated to 
                           fall within ±5% of estimate 95 times out of 100 

Cost?                      Not formally specified, but in the range of 
                           1 to 2 cents per agricultural acre 

Technology constraints?    Must be implementable by current DWR personnel 
                           and processing capabilities 


4.1.2 Generation of Test Data Set 


Once inventory information needs were established and understood by 
all project participants, the sample design phase progressed to the next 
logical step: evaluating the previously used estimation procedures and 
developing an improved system to meet the refined and updated inventory 
objectives of DWR. This evaluation process addressed three major areas of 
the previously used systems: 


• the form and performance of alternative 
sample system estimators 

• the effect of stratification on sampling 
error, and 

• the preliminary computation of sample 
size for planning purposes 

Statistical data collected and analyzed for the 14-County Study were used to 
address these issues. As described in Section 2.2, 1830 Phase I (Landsat) and 
141 Phase II (grd/pi) sample units were selected, allocated by county and 
interpreted for estimating the proportion of land irrigated in the 14-County 
Study area. As the sampling system was a multiphase design, the 141 Phase II 
units were a locationally matched subset of the 1830 Phase I units. Table 4-2 
shows the proportion irrigated measurements for the matched pairs of sample 
units. (A complete description of the fourteen county study from which this was 
derived is given by Wall, Tinney et al, 1978). Using these paired data, Monte 
Carlo simulations were performed to address the three major points listed above. 
A detailed description of the methodology and results of the Monte Carlo tests 
is given in the following section. 


4.1.3 Monte Carlo Simulation 


Using the population of locationally matched pairs of Landsat and ground 
data described above, a stochastic technique was used to re-examine three facets 
of the previous multiphase sampling designs by: (1) testing an alternative 
mathematical estimator to the regression type used to link estimates made at the 
various phases; (2) examining the value of the stratification (designed to 
control measurement error) for controlling sampling error; and (3) computing the 
approximate number of sample units needed to achieve a percent standard error of 
±5% at the 95% level of confidence within a hydrologic basin. 


Testing Model Alternatives 

The statistical technique used to test various alternatives to the 
regression estimator linking the Landsat interpretation and ground data phases 
was a Monte Carlo simulation. Being stochastic, Monte Carlo is a process 
whereby a random sequence of observations (or samples) can be drawn from a 
population in a repeated fashion. By drawing a large number of samples of 
variable size from a target population, useful statistics and distributions of 
that population can be evaluated under various sampling scenarios. 

For the current study, the Monte Carlo simulation was used to test the 
relative performance of two estimators: regression and biased ratio. The biased 
ratio was evaluated as an alternative since this estimator exhibits lower 
variance under certain conditions. The form of these estimators, including their 
variance estimators, is given in Table 4-3. 
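
As a minimal sketch of the two estimators compared in the simulation, the point estimates might be computed as below. The function names and the toy numbers in the usage note are hypothetical; the forms assumed are the usual textbook ones (regression: sample ground mean adjusted by the least-squares slope times the difference between the population and sample Landsat means; biased ratio: ratio of sample means times the population Landsat mean).

```python
from statistics import mean


def regression_estimate(x_sample, y_sample, x_pop_mean):
    """Regression estimator: y_bar + b * (X_bar - x_bar).

    x_sample: Landsat proportions for the matched sample units
    y_sample: ground proportions for the same units
    x_pop_mean: mean Landsat proportion over all Phase I units
    """
    x_bar = mean(x_sample)
    y_bar = mean(y_sample)
    sxx = sum((x - x_bar) ** 2 for x in x_sample)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sample, y_sample))
    b = sxy / sxx  # least-squares slope of y on x (needs non-constant x)
    return y_bar + b * (x_pop_mean - x_bar)


def biased_ratio_estimate(x_sample, y_sample, x_pop_mean):
    """Ratio-of-means estimator: (y_bar / x_bar) * X_bar."""
    return mean(y_sample) / mean(x_sample) * x_pop_mean
```

For example, with hypothetical Landsat values (0.2, 0.4, 0.6), ground values (0.25, 0.45, 0.65) and a population Landsat mean of 0.5, the regression form adjusts the sample ground mean upward by the slope times 0.1, while the ratio form scales the population Landsat mean by 0.45/0.40.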


Table 4-2. Locationally matched pairs of Landsat and ground data measurements 
of proportion of land irrigated used to compute the estimate of 
irrigated acreage. This data was used in the 1979 design phase 
for evaluating alternative estimators, studying the effects of 
stratification and estimating sample size. 
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.779 


CCO-1 

.955 

.356 


a^-2 

.329 

.688 

Sunn 

SU-i 

.616 

.629 


€U-2 

.525 

.563 


SU-1 

.507 

.859 


6L2-2 

.660 

.682 


6U-1 

.625 

.812 


SL5-2 

.398 

.90S 


a«-i 

.358 

.375 


aA*2 

.781 

.823 


6LS-1 

.712 

.910 


0.5-2 

.884 

.7U 


616-1 

.708 

.927 


6L6-2 

.776 

.360 

PUCCA 

PU-1 

.036 

.027 


m-2 

.055 

.067 


PU-l 

.040 

.036 


PU-2 

.120 

.191 


PO-1 

.215 

.173 


PL3-2 

.200 

.190 


PlA-1 

.784 

.782 


PLA-2 

.621 

.722 

Sacaancnto 

SAl-1 

.081 

.112 


SAl-2 

.085 

.rs 


SA2-1 

.085 

. 0:1 


SA2-2 

.196 

.222 


SA2-3 

2S6 

.220 


SA2-5 

.571 

,627 


SA2-5 

.583 

.525 


SA3-1 

All 

.as 


SA3-2 

.835 

.807 


SA3-3 

.886 

.548 


. SA3-^ 

!i36 

.198 


SA3-5 

.908 

.584 


SA4-1 

.858 

.540 


SA4-2 

.868 

.566 


San Joaquin 


Shasta 


SoiONO 


Surm 


TtHANA 


TokO 


Yuia 




SAM 

.S43 

.552 

SJl-2 

•202 

.163 

SJ2-1 

.713 

.733 

SJ2-2 

.970 

.982 

SJI-1 

.858 

.997 

SJ3-2 

.353 

.878 

SJ3-3 

•836 

.917 

SJ3-8 

.356 

.968 

SJ3-5 


.335 

SJ3-6 

.932 

.671 

SJ3-7 

.773 

.392 

SJ8-1 

.539 

.620 

3J8-2 

.535 

.973 

SJ8-3 

.826 

.990 

3JS-1 

.971 

.965 

SJS-2 

.358 

.318 

SJS-3 

.756 

.80S 

SJ5-8 

.508 

.839 

3J5-5 

.962 

.391 

SJ5-6 

.962 

.958 

$J6-1 

.954 

.591 

SJ6-2 

.877 

.375 

SH2-: 

.150 

.173 

SH2-2 

.8C3 

.222 

SH3-2 

.527 

.806 

SOM 

.031 

0.000 

SOl-2 

Q.0OC 

0.000 

S03-1 

.912 

.971 

S03-2 

.874 

.883 

S03-3 

.923 

.993 

S08-1 

.an 

.880 

S08-2 

.618 

.820 

S05-1 

.928 

.8U 

SOS-2 

.359 

.939 

SU2-1 

.S60 

1.000 

SU2-2 

.362 

.353 

SU3-1 

.984 

.395 

SU3-2 

.804 

.609 

SU8-1 

.985 

.809 

SU8-2 

.900 

.956 

SU8-3 

.608 

.623 

SU8-8 

.796 

.982 

TH-l 

.569 

.820 

TU-2 

.022 

.038 

TE2-1 

.165 

.322 

IE2-2 

.383 

.799 

TE2-3 

.377 

.328 

TD-1 

.859 

.726 

TU-2 

.858 

.380 

TtS-3 

.631 

.528 

TE5-8 

.660 

.718 

YOM 

.015 

.028 

m-2 

.035 

.053 

Y02-1 

.UO 

.571 

V02-3 

.575 

.339 

Y02-8 

.698 

.955 

Y03-1 

.758 

.380 

Y03-2 

.951 

.939 

Y03-3 

.372 

.998 

Y03-8 

.682 

AT? 

Y03-S 

.976 

.970 

Y08-1 

.993 

.929 

Y08-2 

.706 

.909 

Y05-1 

.655 

.531 

Y05-2 

.983 

.990 

YUM 

.258 

.972 

YUl-2 

.437 

.515 

YU3-1 

.670 

.613 

YU3-2 

.638 

.615 

YU8-1 

1.000 

.909 

YU8-2 

.335 

.379 

YU5-1 

.371 

.931 

YUS-2 

.683 

.767 

Table 4-3. The Monte Carlo simulation was used to compare Regression and Ratio 
(biased) estimators. Based on Monte Carlo results, SRS and Ratio (unbiased) 
estimators were deterministically evaluated as possible alternatives. 


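Simple Random Sample (SRS) (Cochran 1977:18): 

$\hat{Y} = \bar{y}$   (1) 

$V(\hat{Y}) = \left(1 - \frac{n}{N}\right)\frac{\sigma_y^2}{n}$   (1a) 

Regression (Cochran 1977:189): 

$\hat{Y} = b\bar{X} + (\bar{y} - b\bar{x})$   (2) 

$V(\hat{Y}) = \left(1 - \frac{n}{N}\right)\left(1 + \frac{1}{n-3}\right)\frac{(1 - \rho^2)\,\sigma_y^2}{n}$   (2a) 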


Ratio - unbiased (Goodman and Hartley 1958): 

$\hat{Y} = \bar{r}\bar{X} + (\bar{y} - \bar{r}\bar{x})$ 


V(Y)mM) (a^^R^v 2 Rcov^^ . 


(3) 

(3a) 


Ratio - biased (Cochran 1977:150) : 


Y = (y/x) X 


V{Y) = (1 - ^) + RV - 2RC0V,J,) 


(1 + ^ ^ '^xx ' ^^xy .) 

n n '■ Cyy + 


(4) 


(4a) 


where: 

$\hat{Y}$ = estimate of true proportion irrigated ($Y$) 
$V(\hat{Y})$ = estimate of variance of $\hat{Y}$ 
$N$ = population size 
$n$ = sample size 
$\bar{y}$ = sample mean proportion irrigated for ground data ($y_i$) 
$\bar{X}$ = population mean proportion irrigated for Landsat data ($x_i$) 
$\bar{x}$ = sample mean proportion irrigated for Landsat data ($x_i$) 
$\bar{r}$ = sample mean for ratio $y_i/x_i = r_i$ 
$R$ = true ratio of $Y/X = \bar{Y}/\bar{X}$ 
$\rho$ = sample correlation between $x_i$ and $y_i$ 
$\sigma_x^2$ = sample variance of proportion irrigated for Landsat data 
$\sigma_y^2$ = sample variance of proportion irrigated for ground data 
$\sigma_r^2$ = sample variance for ratio $y_i/x_i = r_i$ 
$cov_{xy}$ = sample covariance of $x_i$ and $y_i$ 
$cov_{xr}$ = sample covariance of $x_i$ and $r_i$ 
$C_{xy}$ = $cov_{xy}/(\bar{x}\,\bar{y})$, and likewise for $C_{xx}$ and $C_{yy}$ 
$b$ = regression coefficient of $y$ on $x$ 

Using the population of 141 matched pairs of Phase I and Phase II proportion 
measurements (Table 4-2), a large number of samples were drawn by stratum for 
Monte Carlo simulation. In addition to calculating a set of basic statistics by 
stratum, the performance of both the regression and ratio estimators was tested 
by computing the average bias and sampling error. Performance was tested at both 
the stratum and 14-County levels for various sample sizes. The general description 
of the simulation is shown in Figure 4-2. 

A summary of the results of the Monte Carlo simulation is tabulated in 
Table 4-4. Three levels of simulation are shown: individual stratum, selectively 
aggregated strata, and all strata combined. For each stratum and combined strata, the 
average bias and sampling error were tabulated with confidence limits specified 
for each level simulated. The average bias and sampling error were calculated 
using the formulas: 

$$\text{average bias} = \frac{\sum_{i=1}^{m} (\hat{Y}_i - Y)}{m}$$ 

$$\text{sampling error} = \sqrt{\frac{\sum_{i=1}^{m} (\hat{Y}_i - Y)^2}{m - 1}}$$ 

where, $\hat{Y}_i$ = estimate of true proportion irrigated ($Y$) from either the 
ratio or regression estimator on iteration $i$ 

$Y$ = ground truth based on DWR-collected data during the 
14-County Study 

$m$ = number of Monte Carlo iterations (50) 
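
The simulation loop described above can be sketched as follows. This is an illustration under stated assumptions, not the project's program: the names `monte_carlo` and `ratio_estimate` are hypothetical, samples are assumed to be drawn without replacement, "truth" is taken as the mean ground value of the matched-pair population, and sampling error is computed as the root of the summed squared deviations from truth divided by m - 1.

```python
import random
from math import sqrt
from statistics import mean


def ratio_estimate(xs, ys, x_pop_mean):
    """Biased ratio estimator, used here only as an example estimator."""
    return mean(ys) / mean(xs) * x_pop_mean


def monte_carlo(pairs, estimator, n, m=50, seed=0):
    """Draw m samples of size n from (landsat, ground) pairs.

    Returns (average bias, sampling error) of the estimator, both
    measured against the population mean of the ground values.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    x_pop_mean = mean(x for x, _ in pairs)
    truth = mean(y for _, y in pairs)
    deviations = []
    for _ in range(m):
        sample = rng.sample(pairs, n)  # sampling without replacement
        xs = [x for x, _ in sample]
        ys = [y for _, y in sample]
        deviations.append(estimator(xs, ys, x_pop_mean) - truth)
    average_bias = sum(deviations) / m
    sampling_error = sqrt(sum(d * d for d in deviations) / (m - 1))
    return average_bias, sampling_error
```

Calling `monte_carlo(pairs, ratio_estimate, n)` for n = 5, 10, 15, 20 on the pairs of one stratum would reproduce the kind of per-stratum summary the text describes; any other estimator with the same three-argument signature can be passed in its place.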


As the sensitivity of both the ratio and regression estimators was based on 
the expected relative bias and sampling error, two important observations can be 
made by reviewing the tabulated results. The first observation is that strati- 
fication appeared not to have significantly reduced sampling error. This obser- 
vation will be addressed in more detail below. The second observation is the 
apparent lack of any significant difference between the two estimators' performance 
as exhibited by the very similar values of average bias and standard errors of the 
estimate at both the 95% and 99% levels of confidence. Except for small sample 
sizes, the regression estimator exhibited lower bias and variance than did the ratio 
estimator. Though the regression estimator was judged superior to the ratio over 
most strata, the results of this Monte Carlo did not clearly indicate which 
estimator, if either, was the better to accomplish the objectives of the state-wide 
inventory of irrigated land proportion. Based on these Monte Carlo results, a more 
in-depth analysis of other mathematical estimators was performed. 

Using the same data set from the 14-County Study, two additional estimators were 
evaluated with the ratio (biased) and regression forms. These two estimators, the 
simple random sample (SRS) and Ratio (unbiased), are shown in Table 4-3. The 
performance of all four estimators was evaluated deterministically by predicting 
ground variance ($\sigma_y^2$) and correlation between Landsat and ground data ($\rho$). 


[Figure 4-2: flow diagram. Strata and number of matched pairs (n'): 
Stratum 1, dryland, n' = 22; Stratum 2, field crops <40 A, n' = 28; 
Stratum 3, 40-79 A, n' = 37; Stratum 4, field crops 80+ A, n' = 30; 
Stratum 5, orchards/vineyards <40 A, n' = 20; Stratum 6, orchards/vineyards 
40+ A, n' = 4. For each sample: basic statistics; ratio and regression 
estimates, variance, and deviation from truth with standard deviation; 
summary for all 50 samples.] 

Figure 4-2. General description of the Monte Carlo simulation. The 141 matched 
pairs of Landsat and ground data from the 14-County study were 
broken into their six strata. From each stratum (stratum 1 shown 
as an example) samples were repetitively drawn (50 times) to 
generate estimates for various sample sizes: 5, 10, 15, 20.... (one 
sample of size 5 is shown as an example). 




























































Table 4-4. Summary of selected results of the Monte Carlo simulation used for Task I. 

By determining the variance of the estimators using variable sample sizes, the 
relative performance of each estimator could be evaluated. That estimator 
exhibiting the lower variance for given sample sizes would be preferred for the 
state-wide inventory. These estimators were chosen for performance evaluation 
after an extensive review of the statistical literature. Under certain conditions, 
the estimator for SRS and/or ratio (biased) estimation could achieve a smaller 
variance for a given sample size (Wensel, 1977). An alternative ratio estimator, 
described by Goodman and Hartley (1958), was selected because it is unbiased and 
provides a good comparison with the previously evaluated biased ratio estimator. 

The performance of the estimators is graphically displayed in Figure 4-3a-d. 
Three of the estimators (SRS, regression, and unbiased ratio) were compared to the 
biased ratio having an assumed standardized variance of 1.0. Variances were 
calculated for each of four strata (1, 2, 4, 5-6) with sample sizes ranging from 
2 to 25, including an estimation of variance for very large sample sizes (n → ∞). 
The variances were standardized by dividing by the variance of the biased ratio 
estimator for the corresponding stratum and sample size. 
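
The standardization step can be illustrated with a short sketch. The variance forms below are assumptions chosen for illustration: the usual simple-random-sample variance, a regression variance carrying the small-sample factor (1 + 1/(n - 3)), and only the first-order approximation for the biased ratio estimator, which may omit higher-order terms used in the report's own calculations. All function names and the numbers in the usage note are hypothetical.

```python
def srs_variance(n, N, var_y):
    """Variance of the sample mean under simple random sampling."""
    return (1 - n / N) * var_y / n


def regression_variance(n, N, var_y, rho):
    """Regression-estimator variance; the (1 + 1/(n - 3)) factor needs n > 3."""
    return (1 - n / N) * (1 + 1 / (n - 3)) * (1 - rho ** 2) * var_y / n


def biased_ratio_variance(n, N, var_x, var_y, rho, R):
    """First-order (large-sample) approximation for the ratio estimator."""
    cov_xy = rho * (var_x ** 0.5) * (var_y ** 0.5)
    return (1 - n / N) * (var_y + R ** 2 * var_x - 2 * R * cov_xy) / n


def standardized(variance, reference):
    """Express a variance relative to the biased-ratio reference (= 1.0)."""
    return variance / reference
```

With hypothetical inputs (variances of 0.09 for both phases, correlation 0.9, R = 1, n = 10, N = 100), the SRS variance is about five times the ratio reference, while the regression variance is slightly above the reference at n = 10 and falls below it as n grows, which is the general pattern the comparison is meant to expose.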

By examining variance plotted against sample size (Figures 4-3a-d), it can be 
seen that the regression estimator was superior to all other estimators for large 
sample sizes (n ≥ 5). The SRS estimator was consistently inferior; at all sample 
sizes SRS standardized variances greater than 3.0 were frequently observed but not 
plotted on the referenced figures. For small sample sizes both ratio estimators 
are superior to regression but indistinguishable from each other. Because the 
standard error of the biased ratio estimator was at most 18% less than that of the 
unbiased ratio estimator, and given the advantages of using an estimator with no 
bias, the unbiased ratio estimator was used for small sample sizes. For operational 
application of these two estimators in a variable sample size environment, specific 
decision rules must be established for sample size computation when using either 
estimator. Table 4-5 lists the range of sample sizes (n) over which the unbiased 
ratio and regression estimators are used. 

Table 4-5. Range of sample sizes (n) over which the unbiased ratio and regression 
estimators are used. 

Stratum    Unbiased Ratio    Regression 
-------    --------------    ---------- 
   1          n ≤ 3            n ≥ 4 
   2          n ≤ 10           n ≥ 11 
   3          n ≤ 5            n ≥ 6 
   4          n ≤ 5            n ≥ 6 
   5          n ≤ 6            n ≥ 7 
   6          n ≤ 6            n ≥ 7 
   7          n ≤ 5            n ≥ 6 

Note: Strata 3 and 7 were not used in the 1976 study but were part of the 
current inventory. Too few observations existed for Stratum 6; units 
in this stratum were combined with Stratum 5 for computational purposes. 
The values given for Strata 3, 6, and 7 are based on the results of those 
strata with the most similar characteristics. 
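
The decision rule in Table 4-5 amounts to a per-stratum threshold lookup. A minimal sketch follows; the dictionary and function names are hypothetical, and the thresholds are as read from the table above.

```python
# Largest stratum sample size for which the unbiased ratio estimator is
# used; above it the regression estimator applies (as read from Table 4-5).
UNBIASED_RATIO_MAX_N = {1: 3, 2: 10, 3: 5, 4: 5, 5: 6, 6: 6, 7: 5}


def choose_estimator(stratum, n):
    """Return the name of the estimator to use for a stratum sample of size n."""
    if n <= UNBIASED_RATIO_MAX_N[stratum]:
        return "unbiased ratio"
    return "regression"
```

For instance, a Stratum 2 sample of 10 units would use the unbiased ratio estimator, while 11 units would switch to regression.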
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Fi«jure 4-3. SunJ«irdized variance as a function of sawple size for the SkS, ratio (biased afW utdMased) atx) regression 
estimators. This data was used to select the Mtheaidtical Model used to link the pliases and to define the 
sapf>le sizes over which they %^uld be used. 









The use of a ratio estimator does present one technical problem: the 
calculation of $r_i = y_i/x_i$ when either or both $x_i$ and $y_i$ are zero. In such cases the 
following contingency table is used: 


N^i 

• 0 

f 0 

■ 0 



t 0 

r, • 1 - K, 

r, • y,/x, 


In summary, four estimators were evaluated using both stochastic and 
deterministic tests to ascertain which estimators generated the lowest variance over 
the range of sample sizes needed for a state-wide inventory system. Both the 
previously used regression estimator and the newly tested unbiased ratio estimator were 
considered to be the best estimators as they exhibited the lowest variances. 
Depending on the stratum sample size, as shown in Table 4-5, the appropriate 
estimator will be used to calculate the proportion of irrigated land. 


Effect of Stratification on Sampling Error 

The second objective of the Monte Carlo tests was to evaluate the agricultural 
practice stratification used in the Sacramento Valley fourteen-county site. Since 
stratification can potentially reduce variance without increasing total cost 
(Wensel, 1977), a careful examination of the results of the stratification was 
warranted. An additional sub-task was also undertaken to study the way in which 
regression estimators interact with stratification. 

A stratification based on land use and field size had been used for the 
Sacramento Valley estimation (Section 2.2). It was designed to control measurement 
error associated with the interpretation of agricultural environments that vary 
considerably in "ease" of interpretation and accurate line placement. The major 
purpose of the Monte Carlo simulations in this instance was to evaluate the utility 
of the 14-County stratification in reducing sampling error as well as measurement 
error. 








The regression estimator used in Task I enables a small amount of costly 
data (i.e., ground survey) to be used in conjunction with a large amount of 
less costly data (i.e., remotely sensed) in such a way that the ground data can correct 
for bias in the remotely sensed data, while Landsat can compensate for the small 
ground sample size, thus reducing sampling variance (Wensel, 1977; Thomas, 1979). 
Combining such an efficient estimator with a potentially error-reducing 
stratification method seemed advantageous. When the stratification and regression 
estimator were combined in the Monte Carlo simulations, however, stratification did 
little to reduce sampling variance. This minimal effect on variance can be seen 
by evaluating the results shown in Table 4-4. Those results indicate that in 
neither of the two stratification levels simulated (6-strata, and selectively 
combined strata) was there a significant difference in variance compared to the 
case where no stratification was used. Since these results would have significant 
implications for this and future studies, further investigation of the regression 
estimator variance equation was warranted. 

For the unstratified case, $V(\hat{Y})_{un}$, the variance formula is (refer to Table 4-3 
for notation): 

$$V(\hat{Y})_{un} = \left(1 - \frac{n}{N}\right)\left(1 + \frac{1}{n-3}\right)\frac{(1-\rho^2)\,\sigma_y^2}{n} \qquad (5)$$ 

While for the stratified case, $V(\hat{Y})_{st}$, we have: 

$$V(\hat{Y})_{st} = \sum_i W_i^2 \left(1 - \frac{n_i}{N_i}\right)\left(1 + \frac{1}{n_i-3}\right)\frac{(1-\rho_i^2)\,\sigma_{y_i}^2}{n_i} \qquad (6)$$ 

Consider the term $(1 - \rho^2)\,\sigma_y^2$ in isolation: 

$$(1-\rho^2)\,\sigma_y^2 = \left(1 - \frac{RSS}{TSS}\right)\left(\frac{TSS}{n-1}\right) \qquad (7)$$ 

$$= \frac{ESS}{n-1} = \frac{MSE\,(n-2)}{n-1} = MSE\left(\frac{n-2}{n-1}\right) \qquad (8)$$ 

Where, 

RSS = regression sum of squares 
ESS = error sum of squares 
TSS = total sum of squares = RSS + ESS 
MSE = mean square error 
$W_i$ = proportion of basin composed of stratum i 

Substituting back into the original variance equations and making the following 
assumptions: 

1) the strata are all the same size, thus $N_i = N$; 

2) the strata all have the same mean square error, thus $MSE_i = MSE_{st}$; 

3) from 1) and 2), optimal allocation would give $n_i = n$; 

4) from 1) it follows that $W_i = 1/k$; 

5) the total population size for the unstratified case is the product of 
the number of strata times the strata size = kN (and likewise the total 
sample size is kn); 

gives the new unstratified variance: 

$$V(\hat{Y})_{un} = \left(1 - \frac{n}{N}\right)\left(1 + \frac{1}{kn-3}\right)\left(\frac{kn-2}{kn-1}\right)\frac{MSE_{un}}{kn} \qquad (9)$$ 

and the new stratified variance: 

$$V(\hat{Y})_{st} = \left(1 - \frac{n}{N}\right)\left(1 + \frac{1}{n-3}\right)\left(\frac{n-2}{n-1}\right)\frac{MSE_{st}}{kn} \qquad (10)$$ 

Comparing the two variance equations shows that for small sample sizes the 
unstratified variance can be lower because: 

$$\left(1 + \frac{1}{kn-3}\right) < \left(1 + \frac{1}{n-3}\right)$$ 

For large n, both these terms approach unity and the only difference between the 
two equations is the value of $MSE_{un}$ and $MSE_{st}$. Thus, stratification will 
significantly decrease the variance only if $MSE_{st}$ is significantly less than $MSE_{un}$. This 
will only occur if the stratified regressions are a significant improvement over 
the single unstratified regression. This may not be the case in either of Tasks I 
or II. Because the Landsat estimate and the ground estimate for each sample unit 
tend to be equal, the regressions all tend toward a slope of 1.0 and an intercept 
of 0.0. 

Thus in practice, stratification may not give a significantly better fit, 
and consequently may not give a significant reduction in variance. Stratification 
will help if strata can be identified that have different biases. After reviewing 
the Monte Carlo results, the stratification was redesigned as described in Section 
4.1.4 in hope of achieving differing regressions. 
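
The small-sample argument can be checked numerically. The sketch below assumes the simplified equal-strata setting described in the text (k strata, each of size N with n sample units) and a regression variance carrying the factors (1 + 1/(n - 3)) and (n - 2)/(n - 1); these forms and the function names are assumptions made for illustration, and the numbers in the usage note are hypothetical.

```python
def unstratified_variance(n, N, k, mse_un):
    """One regression fitted to all k*n units drawn from k*N; needs k*n > 3."""
    kn = k * n
    return (1 - n / N) * (1 + 1 / (kn - 3)) * ((kn - 2) / (kn - 1)) * mse_un / kn


def stratified_variance(n, N, k, mse_st):
    """Separate regressions in k equal strata with n units each; needs n > 3."""
    return (1 - n / N) * (1 + 1 / (n - 3)) * ((n - 2) / (n - 1)) * mse_st / (k * n)
```

With equal mean square errors (say 0.01), six strata of 100 units and n = 5 per stratum, the unstratified variance comes out lower than the stratified one; at n = 1000 per stratum (N = 10,000) the two agree to well within one percent, so any advantage from stratifying has to come from a smaller stratified mean square error.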


Preliminary Sample Unit Computation 

The third major function of the Monte Carlo simulation was to compute the 
approximate number of sample units that would be needed to achieve the stated 
accuracy requirements (±5% at the 95% confidence level for each hydrologic 
basin). As DWR was responsible for collection of ground (Phase II) sample data, 
the preliminary computation of sample size was to provide a guideline for 
planning DWR manpower requirements. Computation of the final sample size for 
the operational inventory is discussed in Section 4.1.5. 
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For each sample size (n»5, 10, 15, 20...) used in the Monte Carlo 
simulation, the number of samples (n*) that fell within S% and 10% of the true 
estimate was determined. This number was converted to a percentage by dividing 

by the number of cycles (m) and multiplying by 100 X 100). The results are 

illustrated in Figures 4-4a to 4-4h. Preliminary sample sizes were predicted 
from these graphs by: (1) stratum, (2) selectively combined strata, (3) no strata, 

(4) hydrologic basin, and (5) state. Based on this preliminary analysis a 
maximum number of 80 units (Table 4-4) per hydrologic basin (800 units for the 
state) was used by OWR for planning the allocation of manpower. 
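
The percentage computation described above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the interpretation of "within 5%" as a relative tolerance on the true value, and the example numbers are assumptions, not values from the simulation.

```python
def percent_within(estimates, true_value, tolerance):
    """Percentage (n*/m x 100) of the m simulated estimates that fall within
    +/- tolerance (expressed as a fraction) of the true value."""
    limit = tolerance * abs(true_value)
    hits = sum(1 for e in estimates if abs(e - true_value) <= limit)
    return 100.0 * hits / len(estimates)


def smallest_sample_size(percent_by_n, target_percent):
    """Smallest simulated sample size whose percentage meets the target,
    or None if no simulated size does."""
    for n in sorted(percent_by_n):
        if percent_by_n[n] >= target_percent:
            return n
    return None


if __name__ == "__main__":
    # Hypothetical estimates of proportion irrigated from five simulation cycles.
    cycles = [0.50, 0.52, 0.47, 0.56, 0.44]
    print(percent_within(cycles, 0.50, 0.05), percent_within(cycles, 0.50, 0.10))
    # Hypothetical summary: sample size -> percent of cycles within 5%.
    print(smallest_sample_size({5: 60.0, 10: 88.0, 15: 96.0, 20: 99.0}, 95.0))
```

Reading the smallest qualifying sample size from such a summary corresponds to reading the preliminary sample size off the plotted curves.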


4.1.4 Specification of the Mathematical Model, Stratification Scheme and
Sample Frame

The Monte Carlo simulations described in Section 4.1.3 provided the information 
needed to refine the mathematical estimators used to link multiphase measurements 
for producing the Task I estimate of irrigated acreage. The simulations also 
indicated that modifications to the stratification scheme would be necessary if the 
stratification was to be used to reduce sampling as well as measurement error. The 
sampling frame remained similar to that used in the previous studies (cluster sample 
units, 1.6 x 8.0 kilometers in size [1 x 5 miles]) although work by Arno (1979) and
UCSB (Appendix I) offered alternatives for further investigation.

Specification of the Mathematical Model


As in the previous studies, the primary equations (estimators) used to link 
Landsat and ground area measurements to produce estimates of irrigated area were of 
the linear regression type. The general form of these equations, as adapted to the 
irrigated lands problem, was established in the original ten county study (Section 
2.1). In that study, a multiphase sampling scheme (Tikkiwal 1955 and 1967) was 
adapted to using iterative estimators whereby the ground (Phase III) estimator used 
the aerial photo (Phase II) estimator which, in turn, used the Landsat (Phase I) 
estimator. In both the present and the 14-county study, only two phases were 
employed: a census at the Landsat phase (Phase I) and a simple random sample 

within strata at the ground phase (Phase II). 


The estimators are affected by the fact that the sample units are considered as 
clusters and that these clusters are of unequal size. As the clusters were of 
unequal size, accurate measures of the sizes of the individual sample units were 
required so that weighted means could be used in the estimators rather than 
unweighted means. Therefore, the Phase I estimator is: 


$$\bar{Y}^* = \frac{1}{n^*} \sum_{i=1}^{n^*} \frac{M_i}{\bar{M}^*}\, y_i^* \qquad (11)$$


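The Phase II estimator is:

$$\bar{Y}' = \bar{y}'_{n'} + \hat{\rho}_{y^*y'}\, \frac{\hat{\sigma}_{y'}}{\hat{\sigma}_{y^*}} \left(\bar{Y}^* - \bar{y}^*_{n'}\right) \qquad (12)$$

where:

$N$ = Population size of units to be sampled
$n^*$ = Phase I (LANDSAT) sample size
$n'$ = Phase II (Ground) sample size
$M_i$ = Size of sample unit i (any consistent unit of measure)
$\bar{M}^*$ = Mean Phase I sample unit size; $\bar{M}^* = \frac{1}{n^*}\sum_{i=1}^{n^*} M_i$
$a_i^*$ = Irrigated area in sample unit i of Phase I
$a_i'$ = Irrigated area in sample unit i of Phase II
$y_i^*$ = Irrigation proportion in sample unit i of Phase I; $y_i^* = a_i^* / M_i$
$\bar{y}'_{n'}$, $\bar{y}^*_{n'}$ = Weighted Phase II and Phase I means over the $n'$ ground sample units
$\hat{\sigma}_{y^*}$ = Sample standard deviation for weighted Phase I observations
$\hat{\sigma}_{y'}$ = Sample standard deviation for weighted Phase II observations
$\hat{\rho}_{y^*y'}$ = Sample correlation between weighted Phases I and II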


Note that Equation 12 uses the Phase I estimator $\bar{Y}^*$. The first term is
the weighted Phase II mean and the second is its regression correction. The
regression coefficient is the term involving the correlation and the standard
deviations. It may be seen from this that higher correlations between Phases I
and II increase the effect of the correction term (which may be either positive
or negative). Also, the smaller the Phase II standard deviation is in relation
to the Phase I standard deviation, the smaller the effect of the correction
term becomes.

The variance estimates are also computed in an iterative manner. The
Phase I estimator is simply the variance of the weighted observations for
simple random sampling with a finite population:

$$VAR(\bar{Y}^*) = \hat{\sigma}^2_{y^*} \left(\frac{1}{n^*} - \frac{1}{N}\right) \qquad (13)$$


The second phase variance estimator is:


VAR 


O'). C(i, -i.) ( 1 * 5 , 43 ) (i-;2,_^,).vas^ (,4, 




This depends directly on the Phase II standard deviation and uses the Phase I
variance estimate. The variance equation (14) differs slightly from the forms
used in both the 10- and 14-county studies. The finite population correction
factor has been changed to $(1/n' - 1/n^*)$, and a small sample size correction
factor has been included: $(1 + 1/(n' - 3))$.

In the present study, the Landsat area measurements constituted a census
of the sample unit population (i.e. $n^* = N$). Thus, $VAR(\bar{Y}^*) = 0$ and
Equation 14 collapses to:

$$VAR(\bar{Y}') = \hat{\sigma}^2_{y'} \left(\frac{1}{n'} - \frac{1}{N}\right) \left(1 + \frac{1}{n' - 3}\right) \left(1 - \hat{\rho}^2_{y^*y'}\right) \qquad (15)$$
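
The two-phase case (Landsat census, ground sample) can be sketched in Python as follows. This is a hedged illustration, not the MPHASE program. It assumes a regression estimator of the form: weighted ground mean plus rho * (sd_ground / sd_landsat) * (Landsat census mean minus Landsat mean on the ground-sampled units), with variance sd_ground^2 (1/n' - 1/N)(1 + 1/(n' - 3))(1 - rho^2). It further assumes that a "weighted observation" is (M_i / mean M) * y_i and that the standard deviations and correlation are computed on the ground-sampled units. All function names and test data are hypothetical.

```python
import math


def _mean(xs):
    return sum(xs) / len(xs)


def _sd(xs):
    """Sample standard deviation (n - 1 divisor)."""
    mu = _mean(xs)
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / (len(xs) - 1))


def _corr(xs, ys):
    """Sample correlation coefficient."""
    mx, my = _mean(xs), _mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def two_phase_estimate(sizes, landsat_props, sample_idx, ground_props):
    """Return (estimate, variance) of the proportion irrigated.

    sizes, landsat_props: one entry per unit in the population (Landsat census).
    sample_idx: indices of the ground-sampled units; ground_props: their
    ground-measured proportions, in the same order. Requires more than 3
    ground units so that the small-sample correction is defined.
    """
    N, n = len(sizes), len(sample_idx)
    if n <= 3:
        raise ValueError("need more than 3 ground sample units")
    m_bar = _mean(sizes)
    census = [m / m_bar * p for m, p in zip(sizes, landsat_props)]
    w_landsat = [sizes[i] / m_bar * landsat_props[i] for i in sample_idx]
    w_ground = [sizes[i] / m_bar * g for i, g in zip(sample_idx, ground_props)]
    rho = _corr(w_landsat, w_ground)
    sd_l, sd_g = _sd(w_landsat), _sd(w_ground)
    estimate = _mean(w_ground) + rho * (sd_g / sd_l) * (_mean(census) - _mean(w_landsat))
    variance = sd_g ** 2 * (1.0 / n - 1.0 / N) * (1.0 + 1.0 / (n - 3)) * (1.0 - rho ** 2)
    return estimate, variance
```

When the ground and Landsat measurements agree exactly on the sampled units, the correlation is 1, the estimate reduces to the Landsat census mean, and the variance is zero; as the agreement weakens, the (1 - rho^2) factor grows and the variance approaches that of a simple ground-only sample.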

In order to calculate the irrigation area estimates, a FORTRAN program,
MPHASE, had been written previously to compute three phase estimates and the
associated variance estimates. In the absence of a third level of information,
MPHASE can be used for two phase estimates as well. In either case, there is
the option to combine the observations from different strata for the two phases
with the least observations in order to obtain more stable standard deviation
and correlation estimates. The program was designed to use as many as seven
variables of interest per run, so that variables other than irrigated proportion
(i.e. small grain and safflower proportions) can be estimated. These
variables need not be input directly. A special FORTRAN subroutine is used
to transform the input variables into the variables of interest. This is
convenient for a project where different measurement procedures may be used as
input (i.e. dots counted, grams weighed) and changed to proportions within
the program. Modifications to the original MPHASE allowed the generation of
ratio estimators and the use of variable cluster sizes, weighting the
proportions appropriately.

Specification of the Stratification Scheme

Based on the results of the Monte Carlo analysis (Section 4.1.3), the
stratification scheme used for this year's statewide estimation was modified
(See Section 2.2 for a description of the original stratification). The
modifications were designed to reduce sampling variance as well as control
measurement error. These new strata were composed of areas that, on Landsat
1:1,000,000 color composite transparencies, appear to be:

Table 4-6. Stratification scheme used in the allocation of sample units for
the Task I estimation of irrigated land.

Stratum Number    Stratum Description

      1           Generally dry farmed
      2           Field crop areas dominated by fields less than 16 hectares
                  (40 acres) in size
      3           Field crop areas dominated by fields less than 16 hectares
                  (40 acres) in size with known high proportion irrigated
      4           Field crops dominated by fields 16 hectares (40 acres) or
                  larger in size
      5           Orchards and vineyards less than 16 hectares (40 acres) in size
      6           Orchards and vineyards 16 hectares (40 acres) or larger in size
      7           Unusual agricultural areas

The procedure used to produce the state-wide stratification is described in
Section 4.2, Stratification and Sample Frame Construction. This revised
stratification scheme will be evaluated at the end of the Task I Inventory.

Specification of the Sampling Frame

For geographic areas, sampling frames usually are constructed as either
a point system referenced by coordinates or an arbitrary clustering of areas
into some convenient size unit (e.g. rectangular areas). The project objective
as well as statistical and implementation considerations all enter into
decisions which lead to the "optimum" strategy for sampling the population.
Photo-related variables were (and may be in the future) a major part of the
system either as a separate phase or as an aid to ground data collection.
Therefore, the sampling frame should allow maximum use of the photographic
capabilities for a given expenditure of effort. For this reason, point systems
are not practical: to photograph a large number of different points with a
single image or a pair of images is very costly. A cluster system is more
economical since larger units allow additional information to be obtained at
little incremental cost.

Initially, the decisions on sample unit size and configuration were based
largely on practical considerations as insufficient data existed to simulate
and optimize sample unit dimensions for large area inventories in California.
A nominal 1.6 x 8.0 km (1 x 5 mi) sampling unit was used for both the 10-County
(Section 2.1) and the 14-County (Section 2.2) studies because: (1) DWR's
standard aerial survey photography covers a one-mile wide strip, (2) a five
mile length is easily located and flown over several dates, and (3) the
north-south orientation corresponds to DWR's survey techniques.

These same considerations were valid for the present study; thus, the
nominal 1.6 x 8.0 km (1 x 5 mi) north-south oriented unit was maintained.
Two modifications were made, however. Given the choice during sample frame
development of having two small or one large sample unit, the larger unit was
favored. This was done to decrease the errors due to possible misregistration
of units when they were transferred onto maps and Landsat enlargements. The
second change was in sample unit orientation. The north-south orientation was
maintained in the Central Valley and other agricultural areas where road
networks were primarily oriented north-south. The sample units in upland areas
and small valleys were oriented along major landforms and/or main
thoroughfares. This was done to prevent having a large number of small sample
units at the expense of having only very few large units, and to increase
driving efficiency for the ground data collection.

Alternatives to the 1.6 x 8.0 kilometer sample unit size have been proposed
by research cooperators at UC Santa Barbara and NASA-Ames Research Center.
When considering modifications to the area of a given sample unit, many
interrelated variables must be addressed. In the absence of a large data set,
one can make certain assumptions about how variance and correlation change with
sample unit size and how costs vary for ground data collection, aerial
photography and Landsat acquisition, and data interpretation and measurement.
Once reasonable assumptions are established, projections can be made as to the
effect of changing sample unit size on estimate accuracy.

Using a set of reasonable assumptions, personnel at UC Santa Barbara compared
the standard 1.6 x 8.0 km (1 x 5 mi) sample unit size to transect sample
units approximately 1.6 x 36.3 km (1 x 22.5 mi). They state that in terms of
total flight time, "transect sampling may be a cost effective alternative to
random segment sampling." (See Appendix I)

Using another set of reasonable assumptions, Arno (1979) compared five
different sample unit sizes: 0.004, 0.040, 0.405, 2.590, 12.950 km² (1, 10, 100,
640, 3200 acres). He states that "for a given cost, accuracy increases as the
unit size increases up to 0.405 km² (100 acres). It peaks between 0.405 and
2.590 km² (100-640 acres), and accuracy decreases as the size grows to 12.950
km² (3200 acres)."

Further analysis by UC Berkeley personnel of these two separate investigations
has indicated any size of unit can be justified based on a given set
of reasonable assumptions. Depending on population size ($N$), correlation
between Landsat and ground measurement ($\rho$), cost of Landsat and ground unit
measurement ($C_L$ and $C_G$, respectively), variance ($\sigma^2$, which varies
with sample unit size), and desired accuracy ($\varepsilon$), a convincing case
can be made for designs ranging from an SRS with one acre sample units to a
regression approach using long transects.

With the large data base from this 1979 Task I Inventory, reasonable
estimates and ranges for $N$, $\rho$, $C_L$ and $C_G$, and $\sigma^2$ can be made
under various accuracy constraints to determine the best sampling scheme for
future surveys. The Task I Evaluation will address these very issues at the
conclusion of the Task I Inventory. Until then, the 1.6 x 8.0 km (1 x 5 mi)
sample unit will be retained as it has proven to be workable.


4.1.5 Sample Allocation Computation 

As can be seen in the analysis flow (Figure 4-2), the sample allocation
computation was based on input from two major sources: (1) the specification
of the mathematical model, stratification scheme and sample frame and (2) a
sample unit list summarized by stratum and county.

Since the sampling design for Task I included the use of stratification,
allocating the sample units required the distribution of sample units among the
strata for each hydrologic basin. The distribution of units could have been
simply proportional to the relative size of each stratum. Since the 1976
14-County Study gave estimates of within stratum variance ($\sigma^2$) and
correlation ($\rho$), the optimum (theoretically giving smallest variance)
allocation of sample units to each stratum ($n_i$) can be accomplished by
minimizing variance subject to a cost constraint, as follows:


minimize:

$$V(\bar{Y}) = \sum_{i=1}^{x} W_i^2\, \sigma_i^2 \left(1 - \rho_i^2\right) \left(\frac{1}{n_i} - \frac{1}{N_i}\right) \left(1 + \frac{1}{n_i - 3}\right) \qquad (16)$$

subject to:

$$\sum_{i=1}^{x} c_i\, n_i \le M \qquad (17)$$

where:

$V(\bar{Y})$ = estimate of variance of the estimate of basin proportion irrigated
$x$ = number of strata
$i$ = stratum number
$W_i$ = proportion of basin composed of stratum i
$\rho_i$ = sample correlation between LANDSAT and ground data in stratum i as
determined in the 14-County Study
$\sigma_i^2$ = sample variance of proportion irrigated for ground data in
stratum i as determined in the 14-County Study
$n_i$ = sample size in stratum i
$N_i$ = population size in stratum i
$M$ = maximum relative cost permitted in basin
$c_i$ = weighted average relative cost of stratum i


The values of $x$, $W_i$ and $N_i$ came from summary tables for each hydrologic
basin (Table 4-7). The basin summary tables were compiled from similar tables
constructed for each county (Table 4-8). The information summarized on the county
table was derived from detailed county sample unit lists that described each
sample unit in terms of agricultural practice stratum, presence or absence
of grain and/or vegetables and relative ease of ground access (Table 4-9).
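
The structure of the allocation problem can be illustrated with a short Python sketch. The study solved it with a nonlinear programming package; the sketch below swaps in a simple greedy heuristic (add one unit at a time to the stratum giving the largest variance reduction per unit of relative cost) purely for illustration. It assumes each stratum contributes a term W^2 * sigma^2 * (1 - rho^2) * (1/n - 1/N) * (1 + 1/(n - 3)) to the basin variance and that total relative cost, the sum of c * n over strata, may not exceed a budget. The function names, dictionary keys and input values are hypothetical.

```python
def stratum_variance(w, sigma2, rho, n, N):
    """One stratum's variance term for a sample of n units out of N (n > 3)."""
    return w ** 2 * sigma2 * (1.0 - rho ** 2) * (1.0 / n - 1.0 / N) * (1.0 + 1.0 / (n - 3))


def greedy_allocation(strata, max_cost, n_min=4):
    """Greedy integer allocation under a relative-cost budget.

    strata: list of dicts with keys W, sigma2, rho, N, c.
    n_min must exceed 3 so the (1 + 1/(n - 3)) factor is defined.
    """
    if n_min <= 3 or any(s["N"] < n_min for s in strata):
        raise ValueError("every stratum needs at least n_min > 3 units")
    alloc = [n_min] * len(strata)
    cost = sum(s["c"] * n_min for s in strata)
    if cost > max_cost:
        raise ValueError("budget too small for the minimum allocation")
    while True:
        best, best_gain = None, 0.0
        for j, s in enumerate(strata):
            n = alloc[j]
            if n >= s["N"] or cost + s["c"] > max_cost:
                continue  # stratum exhausted, or one more unit breaks the budget
            drop = (stratum_variance(s["W"], s["sigma2"], s["rho"], n, s["N"])
                    - stratum_variance(s["W"], s["sigma2"], s["rho"], n + 1, s["N"]))
            gain = drop / s["c"]
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            return alloc
        alloc[best] += 1
        cost += strata[best]["c"]
```

For example, two hypothetical strata, `dict(W=0.7, sigma2=0.09, rho=0.8, N=1294, c=1.0)` and `dict(W=0.3, sigma2=0.04, rho=0.9, N=191, c=1.38)`, with a budget of 65 receive most of the units in the first stratum, since it is larger, more variable, less well correlated with Landsat, and cheaper to visit.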

The constraint function (Equation 17) uses the average relative cost of
ground checking a sample unit in a particular stratum ($c_i$). In the 14-County
Study all sample units (SUs) were located on the floor of the Sacramento Valley
and were considered equally accessible. As sample units were allocated over
the entire state for the 1979 inventory, the assumption of equal accessibility
was not valid. Therefore, sample units were divided into three accessibility
categories. Relative cost weights ($c_i$) were then determined for each stratum.
Appendix II describes the development and use of the ground accessibility
categories to predict $c_i$.

Table 4-7. Example of a hydrologic basin summary table (Basin: Tulare;
County: All). The number of strata, proportion of the basin composed of each
stratum and the population size per stratum came from tables like this
summarized for each hydrologic basin.

Table 4-9. Example of a sample unit list generated for each county within each
hydrologic basin (Basin: Tulare; County: Kern). Each sample unit is described by
agricultural practice stratum, grain and/or vegetables and accessibility.
Sample units that were ultimately selected for ground checking were marked
with an "E" (indicating a ground visit early in the season) or a "/".

After all terms were defined, a computer algorithm, FCOPAK, was used to
minimize Equation 16 subject to Equation 17.

For each hydrologic basin, the total number of sample units was allowed
to vary over the range of 30 to 200. FCOPAK determined the optimal allocation
of these units to each stratum ($n_i$). Percent confidence intervals at the 95,
98, 99 and 99.9% levels of confidence were also calculated for each allocation.
These percent standard confidence intervals were then plotted against the total
sample size. Figure 4-5 illustrates a typical plot using the Tulare Basin
allocation. From these plots, the total number of sample units required to
achieve ±3% at the 95% confidence level was determined by interpolation. As the
stated inventory accuracy objective was ±5% at the 95% confidence level, this
more conservative criterion insured against the possibility that chance alone
would cause a failure to meet the stated goal in any hydrologic basin. As seen
in Figure 4-5, the total number of sample units required to meet the ±3% at the
95% criterion in the Tulare Basin was interpolated to be 65. This value is then
compared to the FCOPAK values bordering this interpolated estimate (i.e. 62
and 81 sample units). The FCOPAK stratum allocation within the Tulare Basin
for the 62 and 81 sample units is tabulated in Table 4-10.

To achieve the desired stratum-level allocation of the 65 basin units, a
second interpolation was performed using the optimal FCOPAK stratum allocation
for 62 and 81 basin units. This procedure was used for all the hydrologic
basins. The resulting allocation of sample units by basin and by stratum is
given in Table 4-11.
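
The second interpolation can be sketched in Python as follows. This is an illustration under stated assumptions: the interpolation is taken to be linear between the two bordering allocations, and the stratum values used in the example are hypothetical, not those of Table 4-10. The rounding helper implements the rule given in the FCOPAK footnote (a fractional part below 0.1 is dropped, otherwise the value is raised to the next integer); whether that rule was also applied to the interpolated values is not stated in the text.

```python
def interpolate_allocation(target_total, total_lo, alloc_lo, total_hi, alloc_hi):
    """Linearly interpolate per-stratum allocations between two bordering
    solutions whose basin totals are total_lo and total_hi."""
    frac = (target_total - total_lo) / float(total_hi - total_lo)
    return [lo + frac * (hi - lo) for lo, hi in zip(alloc_lo, alloc_hi)]


def round_allocation(values, threshold=0.1):
    """Footnote rule: drop a fractional part smaller than the threshold,
    otherwise raise the value to the next integer."""
    rounded = []
    for v in values:
        whole = int(v)
        rounded.append(whole if v - whole < threshold else whole + 1)
    return rounded


if __name__ == "__main__":
    # Hypothetical stratum allocations for basin totals of 62 and 81 units.
    alloc_62 = [4.0, 44.0, 5.0, 9.0]
    alloc_81 = [5.0, 57.0, 6.0, 13.0]
    print(interpolate_allocation(65, 62, alloc_62, 81, alloc_81))
```

Because the footnote rule rounds upward in most cases, rounded stratum values need not sum exactly to the target basin total; any reconciliation step would be a separate design choice.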

After all the sample units were allocated by stratum for each of the hy- 
drologic basins, the units were physically annotated on map sheets for sub- 
sequent ground survey by OWR personnel. Measurement of both the sample units 
on the ground and the Landsat census Is described In the following Sections. 


FCOPAK (Feasible Conjugate Direction Package for the Solution of Differentiable
Mathematical Programs) was developed by Best (1974) to solve the general problem
of maximizing a function subject to linear and/or nonlinear constraint functions.
The program's only shortcoming is that solutions to $n_i$ are generated in
non-integer form. This problem was solved by use of the following contingency
table:

If $n_i$ - integer($n_i$) < 0.1, then $n_i$ = integer($n_i$)
If $n_i$ - integer($n_i$) ≥ 0.1, then $n_i$ = integer($n_i$) + 1

Figure 4-5. Percent standard confidence interval plotted against total sample size
for the Tulare Basin.

Table 4-10. Allocation of sample units by stratum. The values were calculated by
interpolation from the allocation shown in Figure 4-5.

Table 4-11. Sample unit allocation by stratum for each hydrologic basin.

                                     STRATUM
HYDROLOGIC BASIN        1     2     3     4     5     6     7    TOTAL

Central Coastal        26     7    28     8     -     5     6      80
Colorado Desert         -     -     -    42     4     8     4      58
North Coastal           4     6     -    30    12     -     -      52
North Lahontan          -    12     -    26     -     -     -      38
Sacramento Valley       8    10     -    39     5     4     6      72
San Francisco Bay      19     4    11     7    14     -     -      55
San Joaquin             6     5     -    53     5    14     -      83
South Coastal          12    18     9    11    24     -     8      82
South Lahontan          7     -     -    39     -     -     6      52
Tulare                  4     -     -    46     5    10     -      65

California             86    62    48   301    69    41    30     637


4.1.6 Summary

The design process is a critical element in any inventory activity. It
serves to specify the framework for data acquisition, analysis, summary, and
storage and retrieval. By specifying this framework, all phases of an inventory
are performed in a coordinated fashion, thus increasing the probability
of successfully achieving the stated inventory objectives. For the design
process to be successful, the interrelationships between data collection,
decision-making, and management must be understood, documented and integrated
into the design. Only by understanding these interrelationships in concert
with historical inventory practices can realistic assumptions be made,
functional relationships documented, and operational systems developed and
implemented.


The current design effort for the 1979 Inventory had the benefit of close
cooperation with the user agency, California Department of Water Resources,
who provided invaluable management and decision-making insight and critical
information needs on a state-wide basis. Furthermore, DWR conducted the
complete ground survey effort providing the sensitive, costly, and compulsory
ground data to drive the state-wide estimation process.

Based on both DWR input and the 10- and 14-County Studies conducted by
the University of California, important historical data were available for
the design process. The experience and the data were used to (1) generate and
refine assumptions, (2) evaluate various estimator alternatives, (3) evaluate
the effect of stratification on sampling and measurement errors, (4) calculate
estimates of variance and data plane (i.e. Landsat-ground) correlations, both
critical for the sample size calculations, and (5) generate accessibility/cost
constraint functions paramount in the sample allocation process.


After numerous analyses, the inventory design was completed and implemented.
The 1979 design may be summarized as follows:

GOAL:
Estimate the proportion of irrigated acreage in the state of California to
within ±5% allowable error at the 95% level of confidence.

DATA TYPES:
• Multitemporal Landsat color composite imagery enlarged to a scale of
  1:150,000 (Phase I)
• Ground data collected by DWR; supplemented with 35mm aerial photography
  (Phase II)
• USGS maps at scales 1:1,000,000, 1:250,000, 1:62,500 and 1:24,000
• U-2 color infrared aerial photography at a scale of 1:130,000 and 1:24,000

SAMPLING FRAME:
• Sampling frame of area units (clusters)
• 1.6 x 8.0 km rectangular sample unit
• Orientation of sample units predominately north-south; allowed to vary
  with local topography and road network

STRATIFICATION:
• Hydrologic basin, county
• Agricultural practice/land use
• Small grain and vegetable
• Exclusions

MATHEMATICAL MODEL AND SAMPLE ALLOCATION:
• Multiphase design
• Census at Phase I (Landsat)
• Simple Random Sample within strata/basin at Phase II (ground data)
• Phases linked using regression estimator for large sample sizes and an
  unbiased ratio estimator for small sample sizes

When the statewide inventory is completed, a detailed evaluation of the 
Task I design process can begin. The evaluation will allow further refinement 
of assumptions, sampling frame (including size, shape and orientation of SU's), 
two phase sampling, stratification, sample allocation, and the estimation pro- 
cedure (i.e. equation used to link phases and predict errors; and, procedures 
used to aggregate strata estimates into final estimates). 


4.2 STRATIFICATION AND SAMPLE FRAME CONSTRUCTION 

Stratification is a commonly used technique designed to reduce variance
by systematically placing boundaries that separate homogeneous units. For
Task I the major purposes of stratification were to: (1) allow summary of data
by administrative units (hydrologic basin, county and state), (2) reduce
sampling and measurement error, (3) enhance the allocation of sample units,
and (4) flag areas for early and/or multiple ground data collection. The
production of three stratifications was necessary to address the purposes just
described: (1) administrative boundaries were defined by use of a DWR-supplied
map delineating hydrologic basins, and county boundaries were located from USGS
1:24,000 and 1:250,000 scale topographic maps (Figure 4-6); (2) an agricultural
practice stratification was developed to reduce sampling and measurement error
and enhance the allocation of sample units; and (3) areas of small grain and
vegetable cultivation were stratified to help optimize ground data collection.
The latter two stratifications will be described in Sections 4.2.2 and 4.2.3.
As shown on the analysis flow (Figure 4-1), a merged stratification was formed
that became the basis for the sample unit list required to compute the sample
allocation.


Figure 4-6. Counties and hydrologic basins of California.



4.2.1 Regionalization of the State of California 


The University of California, Santa Barbara has involved itself with items 
that impact stratification and the subsequent allocation of sample units. This 
has been in response to U. C. Berkeley's work in sample design. 

Our work in stratification began with the definition of 12 photomorphic
regions in the state based on three criteria: fog/cloud cover; target-to-background
contrast; and presence/absence of agricultural activities. Regionalization
defines those areas where remote sensing techniques are applicable to
the task (e.g. 4 interior regions where satellite remote sensing can be used
were noted as well as 5 regions without agriculture and 3 coastal regions
where fog will most likely interfere with data acquisition) and those areas
where similar techniques can be used.

Subsequent work led to the definition of subregions, or clusters of
counties with similar crop mixes. While this information may have its greatest
value in Tasks III and IV, it was useful for defining those counties with
"problem crops" such as grains and vegetables that must be considered in Tasks
I and II.

This effort was supported by two questionnaires sent to the U. C. Cooperative
Extension Office in each county. The first questionnaire was concerned
with the acreage, the timing, and the specific crops involved in multicropping
(i.e. double or triple cropping). Approximately 75% of the counties have
responded. The second questionnaire was concerned with small grains: the
amount of irrigated and non-irrigated grains, the specific grains involved and
cropping practices. Approximately 30% of the counties responded. The general
pattern seen in the responding counties is that most grains are not irrigated
regularly. Irrigation most often occurs during the preparation stage and
occasionally once between emergence and maturity. This is highly variable from
year to year, depending on seasonal rainfall conditions.

Utilizing the work in subregionalization as well as the questionnaire
results for multicropping and Landsat color transparencies, multicropping strata
were defined for the purpose of allocating samples that require early or late
field visits to detect second crops. This work was done in close cooperation
with U. C. Berkeley and will be used in addition to their earlier stratification
scheme based on field size and crop type (field crop vs. orchard).

Much of the work done on stratification for Task I will be extended to
Tasks III and IV. One possibility is the use of DWR land use summaries for
7.5' quadrangles to define crop mix strata for sample allocation and possible
a priori classification in Task IV. The data are already in a computer compatible
format and could be tested for a small region. They possibly could be useful for
defining strata based on the amount of irrigated acreage for Task II. While
this requires certain assumptions about crop stability over time, the DWR
land use surveys represent some of the best data available.

Two other sources of data that may prove valuable for stratification,
especially Tasks III and IV, are crop maps produced by the Soil Conservation
Service Statewide Important Farmland Mapping Program and county crop maps
found in the California Department of Food and Agriculture's Report on
Environmental Assessment of Pesticide Regulatory Programs. These data sources
should be examined for strata definition for crop identification.


4.2.2 Agricultural Practice Stratification 

The agricultural practice stratification developed for the Sacramento
Valley (Section 2.2) was used as the base for the 1979 work. It was based on
two general factors that are critical to both manual and digital classification
of Landsat: land use and field size. When defining land use for the purpose
of estimating irrigated agriculture, there are several pertinent factors to
be examined: (1) the presence/absence of any agriculture; (2) historically
known or topographically defined areas of dryland vs. irrigated agriculture;
and (3) variations in agricultural cropping practices within a generally
irrigated area (i.e. field crops vs. orchards). The problems caused by small
field size affect the human analyst, for whom detecting and identifying fields
as well as accurately drawing boundaries becomes difficult and tedious, and the
computer, where the edge effect of mixed pixels and precise registration of
acquisitions is critical. Before extending the stratifications completed on
the Sacramento Valley to the rest of the state, Monte Carlo tests were
performed examining the value of the strata (Section 4.1.3). Based on the results
of the tests, a modification of the original stratification scheme was used
(Table 4-6).

To minimize interpreter variability, the entire state (approximately 30
Landsat frames per date) was stratified by a single analyst into one of the
seven strata described above. Since the minimum sample unit size was one square
mile, areas less than that were not delineated (areas less than one square
mile are subject to measurement for total irrigated acreage on Landsat but
were considered too small to act as individual sample units). Stratification
was done by overlaying clear acetate on 1:1,000,000 Landsat color composite
transparencies and delineating the appropriate stratum. Multitemporal Landsat
imagery was used to verify the consistency of the delineation. Since quite
different agricultural practices and, therefore, quite different strata may
appear similar on any single date of imagery, it is very important to utilize
the multitemporal capability and synoptic coverage of full frame Landsat to
obtain an accurate, repeatable stratification. California's virtually cloud
free summer growing season over the major agricultural areas lends itself
particularly well to the availability of a large set of Landsat data for this
purpose. Figure 4-7 shows an example of the agricultural practice
stratification used in the Sacramento Valley.


4.2.3 Small Grain and Vegetable Stratification 

In order to direct the collection of field data, two additional
stratifications were necessary. Areas of small grain and vegetable cultivation have
historically posed a problem in ground data collection due to: (1) early harvest
of grains and subsequent plowdown, and (2) multiple cropping in vegetable areas
(Section 6.1 describes the dynamics of multiple cropping in the south coastal
area of California). To ensure ground data acquisition at the optimum time,
areas of grain cultivation and vegetable cultivation were stratified separately
on 1:1,000,000 Landsat transparencies for each hydrologic basin.


Stratum Number    Stratum Description

      1           Generally dry farmed

      2           Field crop areas dominated by fields
                  less than 16 hectares (40 acres) in size

      3           Field crop areas dominated by fields
                  less than 16 hectares (40 acres) in
                  size with known high proportion irrigated

      4           Field crops dominated by fields 16
                  hectares (40 acres) or larger in size

      5           Orchards and vineyards less than 16
                  hectares (40 acres) in size

      6           Orchards and vineyards 16 hectares
                  (40 acres) or larger in size

      7           Unusual agricultural areas


Figure 4-7. Agricultural practice stratification. Stratification similar to this was
completed for the entire state and was used for the allocation of sample
units that were ground checked. (Sacramento is located slightly southeast
of center and marked with an "X").

After examining historical data on vegetable cultivation, boundaries
of historical vegetable cultivation areas published by the California Crop and
Livestock Reporting Service were transferred to the Landsat imagery. These
boundaries were refined by reference to the Landsat imagery to account for land
use changes and urban encroachment.

Small grain cultivation areas were delineated through analysis of 1976
through 1978 Landsat imagery. The grain areas were then classified into:
(1) dryland grain farming; (2) areas of less than 21 percent grain; (3) 21-40
percent grain; and (4) greater than 40 percent grain.

Areas where multiple cropping occurs were examined through historical data
and information from the county farm advisors (Regionalization, Section 4.2.1).
Most multiple cropping in California occurs in grain areas, where grain is
followed by a field crop such as corn or beans, or in vegetable areas, where one
vegetable crop follows another. The previous delineation of the grain and
vegetable cultivation areas, therefore, included the majority of the multiple
crop areas.


4.2.4 Formation of the Merged Stratification


Merging the agricultural practice, small grain and vegetable stratifications,
as well as locating administrative and exclusion areas, was necessary before the
sample unit list could be generated. Locating administrative boundaries such
as counties, locating exclusion areas (established wildlife refuges, cities), and
assigning an access code (Appendix II) were facilitated by reference to
available maps.

Since the agricultural practice and crop type stratifications were based on the
spatial and spectral information provided by Landsat, it was felt that an
appropriate base for the merging of these functions was a combination of
1:250,000 scale USGS topographic maps and 1:250,000 scale Landsat enlargements.
Enlargements were made on a county basis, by reference to the USGS maps. These
enlargements and the associated maps provided the base upon which the sample
frame of 1.6x8 km (1x5 mile) units was created. The subsequent sample unit lists
provided the population from which the ground data units were selected. In
addition to providing the sample frame base, the combination of information
available from the maps and enlargements was critical for accurate transfer of
the sample unit boundaries selected for ground checking to the 1:24,000-scale
(7.5') USGS maps used by DWR for field work.

For each of the 58 counties in California, the land use strata, grain
cultivation boundaries, and vegetable cultivation boundaries were enlarged from
the original 1:1,000,000 scale to the 1:250,000 scale Landsat prints. Using an
overhead projector system, each boundary was projected, scale matched to the
enlargement and drawn on a clear film overlay. By matching topographic features
on both the original transparencies and the enlarged prints, accurate transfers
of boundaries were made. In situations where two or more strata had boundaries
that were approximately coincident, they were merged into a single boundary
(Figure 4-8). Color coding allowed for the differentiation of land use, grain
and vegetable strata. Once the enlargement was complete, the overlays could be
used on either the Landsat prints or the 1:250,000 scale topographic maps.

County boundaries were drawn from the 1:250,000 scale maps and overlaid
on the merged strata boundaries. At this point all image-defined agricultural
phenomena were tied to the county base map. Hydrologic basin boundaries,
provided by DWR, were transferred onto the overlays for those counties that
were split into more than one hydrologic basin. Accurate location of this
boundary was particularly important in those areas where the basin boundary
crossed agricultural land, since misplacement of the boundary would result in
farmland being assigned to the wrong basin.


4.2.5 Generation of Sample Unit List 

When the merging of the strata and the location of county, hydrologic and
exclusion areas was complete, each county consisted of a set of irregularly
shaped polygons defined by some combination of the strata. Each polygon was
labelled to indicate the appropriate land use stratum, the presence of vegetables,
the presence and proportion of grain and the general accessibility of the polygon.
The merged and annotated overlay was then placed over a gridded template of
1.6x8 km (1x5 mi) sample units and the 1:250,000 USGS map. In those areas where
the predominant field pattern was oriented north-south, the sample unit grid
was placed to coincide with section lines. This was done to increase the ease
and efficiency of the field data collection effort. In areas where the topography
or historical land development caused the dominant field pattern to be oriented
in other directions (e.g., Salinas Valley), the sample unit grid was placed so as
to conform with the developed road/field pattern system. The sample unit grid
was traced onto the county boundary overlay for all areas that fell within
the stratified area.

Although a sample unit was nominally defined to be 8 km (5 mi) long, actual 
length varied from 1.6 to 11.2 kilometers (1 to 7 mi). Editing of the sample 
units removed those less than 259 hectares (640 acres) in area and those portions 
of units that were less than .4 kilometer (.25 mile) wide. Each sample unit was 
then numbered and placed in a sample unit list. 
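
The editing rules above can be illustrated with a short sketch. It is not the
procedure actually used (the editing was done manually on overlays); it assumes
each sample unit can be approximated by rectangular portions, and the names
`edit_sample_units` and `number_units`, as well as the order of the two
editing steps, are hypothetical.

```python
MIN_AREA_HA = 259.0   # 640 acres (one square mile), from the text
MIN_WIDTH_KM = 0.4    # 0.25 mile, from the text


def edit_sample_units(units):
    """Apply the two editing rules to a dict of sample units.

    units maps a hypothetical unit id to a list of (width_km, length_km)
    rectangular portions. Assumption: narrow portions are trimmed first,
    then units that are too small are dropped.
    """
    kept = {}
    for unit_id, portions in units.items():
        # Drop portions narrower than 0.4 km.
        wide_enough = [(w, l) for (w, l) in portions if w >= MIN_WIDTH_KM]
        # 1 square kilometer = 100 hectares.
        area_ha = sum(w * l for (w, l) in wide_enough) * 100.0
        if area_ha >= MIN_AREA_HA:
            kept[unit_id] = wide_enough
    return kept


def number_units(kept):
    """Assign consecutive numbers to the surviving units (the unit list)."""
    return {number: unit_id
            for number, unit_id in enumerate(sorted(kept), start=1)}


units = {
    "a": [(1.6, 8.0)],               # nominal 1.6x8 km unit
    "b": [(1.6, 1.0)],               # 160 ha, below the minimum area
    "c": [(1.6, 4.0), (0.3, 4.0)],   # second portion is too narrow
}
kept = edit_sample_units(units)
unit_list = number_units(kept)
```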

The information from the sample unit list was summarized in a table for each 
county. Similar summary tables were made for each hydrologic basin. The basin 
summary sheet was used to calculate average relative access cost within each 
stratum and the proportion of the basin represented by each stratum. This 
information, along with the number of sample units in each stratum (the population 
size) was used to compute the stratum sample sizes as described in Section 
4.1.5. 
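
As an illustration of the basin summary, the sketch below computes the
population size, basin proportion and average relative access cost for each
stratum. It assumes the sample unit list can be reduced to
(stratum, access cost) pairs and that the proportion is taken over the count
of sample units; both are assumptions, and `summarize_basin` is a
hypothetical name.

```python
from collections import defaultdict


def summarize_basin(sample_units):
    """Summarize a basin's sample unit list by stratum.

    sample_units is an iterable of (stratum, relative_access_cost) pairs,
    one pair per sample unit (an assumed, simplified record layout).
    """
    counts = defaultdict(int)
    cost_sums = defaultdict(float)
    for stratum, cost in sample_units:
        counts[stratum] += 1
        cost_sums[stratum] += cost
    total = sum(counts.values())
    return {
        stratum: {
            "population_size": counts[stratum],
            "proportion": counts[stratum] / total,
            "mean_access_cost": cost_sums[stratum] / counts[stratum],
        }
        for stratum in counts
    }


# Hypothetical basin with two strata and four sample units.
summary = summarize_basin([(2, 1.0), (2, 3.0), (4, 2.0), (4, 2.0)])
```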


4.2.6 Preparation of 7.5 Minute Quads for Ground Measurement 

After the units to be ground surveyed were randomly selected, the boundaries
of these sample units were visually transferred from the 1:250,000 scale overlay
to 1:24,000 scale overlays. Standard DWR procedures call for the use of USGS
7.5' quadrangle maps; each overlay was keyed to this set of maps and contained
the selected sample units. Field crews from DWR used these overlays for field
mapping of irrigated cropland.

for the allocation procedure.








Figure 4-9. Distribution of ground (Phase II) sample units. Each of these 637 units
was checked by DWR to determine the location of irrigated fields.




Normally, California's virtually cloud-free summers over major agricultural
areas provide ample opportunity to select from a large variety of acquisitions
to obtain the optimal data set. In 1979, however, satellite and ground processing
problems often combined to severely limit or nullify any choice of acquisitions.
Appendix III lists the acquisitions used for each county. Although certainly
not the optimal date selection, three time periods of imagery were generally
available for each of the counties.


4.3.2 Enlarging and Mosaicing Landsat Frames 

In 1979, as in the earlier projects, measurement at the Landsat phase
was done on 1:150,000 scale enlargements of each county. On a county basis,
each available Landsat frame was evaluated for image quality (e.g., line drop,
"smearing"), color balance, exposure and miscellaneous items such as cloud and
smoke. Following this evaluation, the best combination of dates and frames was
selected for enlargement.

In order to maximize the efficiency of the darkroom work, each transparency 
selected for enlargement was prepared as follows: 

• 1:1,000,000 scale county boundaries from a USGS
map were overlaid on the transparencies

• Templates were prepared that outlined the area that
would be covered by 8x10, 11x14, 16x20 and 20x24
inch photographic prints

• The appropriate combination of templates needed to
photograph the county was selected, annotated and
numbered on each transparency

• Ten mile segments were randomly measured over the
area of the transparency. A scale (100 lines/centimeter)
and an enlargement factor were then aligned on the
border of each template and included when the
negative was copied.

These prepared transparencies were sent to the darkroom for enlargement. Scale
matching was done by use of: (1) the 100 lines/cm scale and enlargement factor
on the transparency, (2) 1:150,000 scale county boundaries plotted by the Office
of Geometronics of Caltrans (State of California, Department of Transportation)
and (3) reference to 1:250,000 scale USGS maps for topographic features and
location of the county boundaries.
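
The arithmetic behind the scale matching can be sketched as follows. The
nominal scales (1:1,000,000 transparencies, 1:150,000 enlargements) are from
the text; the function names are hypothetical, and the actual enlargement
factor for each template was derived from the measured ten mile segments
rather than from nominal scales.

```python
CM_PER_MILE = 160934.4  # 1 statute mile in centimeters


def enlargement_factor(source_scale_denominator, target_scale_denominator):
    """Linear magnification needed to go from the source to the target scale."""
    return source_scale_denominator / target_scale_denominator


def map_length_cm(ground_miles, scale_denominator):
    """Length on the image, in cm, of a ground distance at a given scale."""
    return ground_miles * CM_PER_MILE / scale_denominator


factor = enlargement_factor(1_000_000, 150_000)   # about 6.67x
segment_before = map_length_cm(10, 1_000_000)     # about 1.61 cm
segment_after = map_length_cm(10, 150_000)        # about 10.73 cm
```

A ten mile segment that measures longer or shorter than about 1.61 cm on the
transparency indicates a local departure from the nominal scale, which is
presumably why the segments were measured before an enlargement factor was
recorded on each template.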

At this time, sixteen of the fifty-eight counties in California have been
enlarged. These counties represent approximately 33% of the total land area
of California but contain 60% of the total possible agricultural sample units.

When the enlargement was completed, each county was mosaiced together and
mounted on stiff posterboard. Counties that would have a mosaiced size greater
than approximately 750cm x 1 meter were divided and mounted on separate boards.
This size limitation facilitated handling and interpretation as well as storage.


4.3.3 Generation of Recording Forms


Forms for recording the interpretation done on the multitemporal Landsat
enlargements were created for each county. To produce the form, the 1:150,000
scale county boundaries plotted by Caltrans were located on one of the completed
mosaics for each county. The county boundary was then traced onto a second overlay;
the originally plotted boundary was archived. The agricultural practice strata,
exclusions and hydrologic basin boundaries were transferred from the 1:250,000
scale overlays by interpretation. The agricultural practice strata boundaries
were necessary because (1) interpretation responsibilities were divided between
analysts based on these strata boundaries, and (2) digitization of interpretation
results was needed by stratum. Exclusion areas were also transferred from the
1:250,000 overlays; reference was also made to 1979 U-2 1:130,000 scale CIR
aerial photography to refine boundary placement. The hydrologic basin boundaries
were needed for summarization of results and as a logical way to divide work
between the Berkeley and Santa Barbara campuses. A grid that defined the borders
of 7.5 minute quadrangles was superimposed on the overlay, which was now a
composite of county, agricultural practice strata, exclusion and hydrologic
basin boundaries. The grid was used as a mechanism for organizing interpretation,
as a unit for documenting the time required to perform interpretation and as a
potential area for summarization and comparison of results with DWR's land use
surveys.


4.3.4 Interpretation of Multitemporal Landsat Imagery

The interpretation logic and procedures for identifying irrigated land in
California are basically the same as those used in the past projects.
The analyst is required to make a decision on whether a particular parcel of
land is irrigated. To do this the analyst relies on a variety of image
characteristics and logical expectations of the presence and appearance of
irrigated land.


Providing the analyst with sufficient data to develop reasonable expectations
is critical to accurate measurement at the Landsat phase. Prior to interpreting
a particular county the analyst is given a variety of ancillary information upon
which to build his expectations. These include: (1) California Crop-Weather,
which is published on a weekly basis by the California Crop and Livestock Reporting
Service and summarizes weather conditions over the state as well as land
preparation, planting, growth condition and harvesting of field crops, fruit and
nut crops, vegetable crops and livestock (pasture and range conditions). The
information is summarized by region and provides the means for constructing
year- and region-specific crop calendars; (2) Agricultural Commissioner's crop
reports for 1979, which list county acreage by crop type; (3) California-Arizona
Farm Press, which publishes weekly reports on all facets of agriculture in the West
including land preparation, planting, irrigation and water problems, pest and
disease management, fertilization, plant variety performance, economic, marketing
and tax issues, legislation and harvesting; (4) California Grower and Rancher,
a monthly magazine on agriculture in California (written and published
regionally); (5) 1979 U-2 1:130,000 color infrared photography of the majority
of agricultural land in California; and (6) antecedent DWR land use survey quads
and summary statistics. Using all or a subset of the available data, the
analyst builds a mental model of what he expects to see on the dates of Landsat
imagery provided for each county.

Image characteristics traditionally used in manual photographic
interpretation are exploited in the analysis of Landsat imagery. For the majority
of the interpretation, the most critical characteristics are (1) pattern (is
this an area of agricultural fields?) and (2) color (is this field the color
expected for an irrigated field on the date being analyzed?). Other critical
characteristics that the analyst relies on are texture, shape of fields, and
location of fields. These last three characteristics are particularly important
when interpreting in mountain areas, along rivers and streams (intermingled
riparian vegetation), on the fringes of well developed agriculture and in areas
of dispersed agriculture such as the foothills.

The interpretation procedure calls for analysis to be done in a specific
manner. The structure of the interpretation system is designed to (1)
eliminate variability in the method interpreters use and (2) allow for a
detailed evaluation of the separate parts of the analysis system. The procedure
calls for:


• Within each hydrologic basin a single interpreter
is assigned to each stratum. An interpreter
may analyze more than one stratum per basin, but
no stratum should be interpreted by more than one
analyst.

• Using the 7.5' grid as a base, interpretation
proceeds on a quad-by-quad basis moving left to
right and top to bottom

• Interpretation is done on the mid-summer date first,
the spring image second and the fall date last. In
strata where irrigated agriculture dominates, the
analyst delineates areas that are not showing active
vegetative growth in July/August. These areas are
marked with a single dot. The overlay is then placed
over the May image and the blocks marked with a
single dot are checked; if these areas are interpreted
as irrigated cropland in May, a second dot is added.
The analyst then proceeds to the final date and
checks the remaining singly dotted areas.

• Within each 7.5' quad the analyst records the time
required to interpret each stratum on each date

• Areas are re-checked as necessary
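
The three-date dotting logic described in the list above can be summarized in
a small decision function. This is only an interpretation of the manual
procedure: the text does not state how the fall check is recorded, so the
returned `irrigated` flag and the function name `interpret_block` are
assumptions, as is the premise that active summer growth in an
irrigation-dominated stratum is counted as irrigated.

```python
def interpret_block(active_in_summer, irrigated_in_spring, irrigated_in_fall):
    """Return (dots, irrigated) for one block in an irrigation-dominated stratum.

    dots is the number of dots the block would carry on the overlay;
    irrigated is the assumed final call for the block.
    """
    if active_in_summer:
        # Active growth in July/August: the block is not delineated.
        return 0, True
    if irrigated_in_spring:
        # Single dot from the summer date plus a second dot from May.
        return 2, True
    # Still singly dotted: the fall date decides (assumed outcome).
    return 1, irrigated_in_fall


summer_crop = interpret_block(True, False, False)
spring_crop = interpret_block(False, True, False)
fall_crop = interpret_block(False, False, True)
never_green = interpret_block(False, False, False)
```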


4.3.5 Digitization of Measurement Results 


Upon completion of the interpretation, the results must be tabulated for
input to MPHASE. The first step in this process is to locate the sample units
that had been selected for ground checking. Accurate location is absolutely
vital to the estimation procedure since the comparison of ground proportion
irrigated to Landsat interpreted proportion irrigated "corrects" the estimate
and provides the data needed to compute accuracy statements. Location is
accomplished by reference to the 7.5' quadrangle maps with an overlay of the
ground-annotated sample units. By visual comparison of the ground data (field
pattern) and map features (e.g., roads, canals, railroads) to the 1:150,000
scale Landsat enlargements, accurate location is possible.

The proportion of irrigated land is then calculated by digitizing the 
total area of each sample unit and the area that is irrigated. Each sample 
unit is digitized and recorded separately. The remainder of the interpretation 
is digitized by stratum within each county. 
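
The proportion calculation can be sketched as below, assuming the digitizer
output is available as lists of (x, y) polygon vertices in consistent units.
The shoelace area formula is a standard technique substituted here for
whatever the digitizing equipment computed internally, and the function names
are hypothetical.

```python
def polygon_area(vertices):
    """Area of a simple polygon from its (x, y) vertices (shoelace formula)."""
    twice_area = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def proportion_irrigated(unit_outline, irrigated_polygons):
    """Irrigated area divided by the total digitized area of the sample unit."""
    total = polygon_area(unit_outline)
    irrigated = sum(polygon_area(p) for p in irrigated_polygons)
    return irrigated / total


# Hypothetical 1.6 x 8 km sample unit with its western half irrigated.
unit = [(0.0, 0.0), (8.0, 0.0), (8.0, 1.6), (0.0, 1.6)]
fields = [[(0.0, 0.0), (4.0, 0.0), (4.0, 1.6), (0.0, 1.6)]]
p = proportion_irrigated(unit, fields)
```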


4.3.6 Irrigated Cropland Mapping Procedure 

Some water management applications require spatially defined data that are
available only in map format. As a small part of our effort, we have
continued to improve and evaluate cropland mapping procedures in cooperation
with Kern County Water Agency (KCWA). Since 1972, yearly maps of irrigated
cropland have been supplied to KCWA based on manual interpretation of satellite
or aircraft acquired imagery. More recently a multistage mapping procedure has 
been implemented to effectively integrate both types of imagery. Figure is 
a map generated from 1978 Landsat imagery and previous data provided by KCWA. 

An update of 1979 using the procedure shown in Figure 4-11 is underway. 

A combined satellite and aircraft approach takes advantage of both the
temporal frequency of Landsat multispectral imagery and the higher spatial
resolution of aircraft photography to provide a product more useful than is
available from either source individually. Multidate Landsat imagery is
necessary for accurate mapping of irrigated cropland because of Kern County's
long growing season, its numerous crops, and cropping practices (e.g., double
cropping). While aircraft photography is obtained much less frequently than
Landsat imagery, its higher spatial resolution is more suitable for detailed
feature mapping (e.g., field boundaries and homesteads) and identifying
specific ground conditions; this detailed information has proved to be highly
complementary to temporal Landsat imagery in the classification process.

A comparison with California Department of Water Resources field based
maps for approximately 175,000 acres (five 7.5' USGS quadrangles) has shown the
multistage approach to be very accurate, with only young permanent crops (e.g.,
orchards and vineyards) causing interpretation problems. Since permanent crops
are relatively stable once established, this problem appears amenable to
improvements in the interpretation procedures; periodic ground surveys or high
resolution aircraft photography should allow identification and mapping of
permanent crops without yearly reinterpretation except to check for removal.

Since 1976 Kern County Water Agency has funded the yearly cropland updates.
The one man-month effort required to complete each update provides a
cost-effective and operational demonstration of remote sensing technology for
water management purposes. Based upon the success of this ongoing program we
plan to investigate the potential of a cooperative mapping program involving
all of the water agencies in the Central Valley.

Future research activities include an evaluation of Landsat RBV imagery
as a partial replacement for the aircraft photography and a digital
implementation of the multistage procedure. It appears possible to effectively
combine both mapping and sampling procedures into a system that can provide
spatially defined products with quantitative confidence statements regarding
total irrigated acreages.

CROPLAND MAPPING PROCEDURE

4.4 GROUND MEASUREMENT


For each of the 637 Phase II sample units, DWR district personnel made a
field-by-field inspection to determine the presence of irrigation. Using 7.5'
USGS quads with the plotted sample unit outlines as a base (Section 4.2.6),
field boundaries were drawn and each field coded. In many cases, detailed ground
data collection, including specific crop type mapping, was done. At a minimum,
the ground crews mapped parcels as irrigated or non-irrigated grain, safflower,
field crop, pasture, other agricultural classes or lawn areas; fallow, farmsteads,
feedlots, or dairies; native vegetation, water surfaces or unsegregated native
vegetation; and six classes of urban. More than one visit was made to many of
the units to verify multiple cropping.

When collecting the field data, DWR often used its previously mapped
land use survey quads (Figures 1-1 and 1-2) and the 35mm color aerial slides
from which the maps were derived as aids for defining field boundaries. Color
infrared 1:130,000 scale aerial photography flown by the U-2 during the spring
and early summer of 1979 was also used extensively as soon as it was available.
For a few units where access was particularly difficult, low altitude aerial
observation of the unit provided the necessary information.

Each of these sample units was then tabulated by DWR and acreages output
in a variety of forms: (1) by hydrologic basin - individual sample units
listed by county (Figure 4-12); (2) by 7.5' quadrangle - sample unit(s) and
county; (3) by county - cumulative summary of all sample units within the county;
and (4) by DWR district - cumulative summary of all sample units mapped by the
individual district offices. In total, DWR personnel ground checked (at least
once) and tabulated approximately 520,400 hectares (1,286,000 A) across the state.

4.5 ESTIMATE SUMMARY, EVALUATION AND REPORT 

The final estimate will be completed in mid-1980. At this time enlargement 
of Landsat, interpretation and tabulation are in full production. Following 
the calculation of the results, a detailed evaluation of the individual inventory 
system components and the overall system performance will be done. Working 
closely with DWR, the following system components will be evaluated: 

• Sampling frame

    • Stratification

    • Sample unit size

    • Sample unit orientation

    • Sample unit frame construction

• Sample allocation

    • Revised allocation using statistics available
      from 1979 data set

    • Comparison of revised allocation and expected variance
      to allocation used and variance achieved

    • Comparison of sample variance from proportional
      versus optimal allocation of SUs for a set of
      fixed sample sizes

    • Determination of sample allocation and resulting
      sampling variance from probability proportional
      to estimated size (PPES) SU selection

    • Determination of possible cost savings using
      systematic selection of area SUs for given
      sample variance goals

• Measurement procedure

    • Landsat image interpretation

    • Ground data collection

    • Digitizing method

• Irrigated land area estimation procedure

    • Equations used to link sample phases to produce
      area estimates

    • Equations used to predict errors associated with
      area estimates

    • Procedures used to aggregate stratum estimates
      into final estimates
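
As background for the allocation comparisons listed above, the following
sketch contrasts proportional allocation with the textbook Neyman (optimal,
equal cost) allocation for a stratified sample. It is not the allocation
procedure of Section 4.1.5, which also involves access cost; the stratum
sizes and standard deviations are made-up numbers, and all function names
are hypothetical.

```python
def proportional_allocation(n, sizes):
    """Allocate n sample units in proportion to stratum population sizes."""
    total = sum(sizes)
    return [n * size / total for size in sizes]


def neyman_allocation(n, sizes, std_devs):
    """Allocate n in proportion to N_h * S_h (equal cost per unit assumed)."""
    weights = [size * sd for size, sd in zip(sizes, std_devs)]
    total = sum(weights)
    return [n * w / total for w in weights]


def variance_of_mean(sizes, std_devs, allocation):
    """Variance of the stratified mean with finite population correction."""
    population = sum(sizes)
    variance = 0.0
    for size, sd, n_h in zip(sizes, std_devs, allocation):
        weight = size / population
        variance += weight ** 2 * sd ** 2 / n_h * (1.0 - n_h / size)
    return variance


sizes = [100, 300]        # hypothetical numbers of SUs per stratum
std_devs = [0.4, 0.1]     # hypothetical std. dev. of proportion irrigated
prop = proportional_allocation(40, sizes)
opt = neyman_allocation(40, sizes, std_devs)
var_prop = variance_of_mean(sizes, std_devs, prop)
var_opt = variance_of_mean(sizes, std_devs, opt)
```

Fractional allocations are left unrounded here; in practice each stratum
sample size would be rounded to a whole number of sample units.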

An evaluation of the overall system performance will also be done. This will 
include: 


• Determination of estimate error by reporting unit
relative to DWR baseline

• Determination of the sensitivity of the final error
estimates to individual inventory components

• On the basis of the inventory results, development
of expected error versus cost curves for given levels
of statistical confidence

• Summarization of expected through-put rates

Following the detailed procedure outlined above, the results of the evaluation
will be reviewed in concert with DWR and NASA to determine whether the inventory
system demonstrated during 1979 met DWR's performance requirements at that time.
Recommendations will be given as to what changes, if any, should be made before
a future operational implementation of the inventory system.

5.0 ESTIMATION/MAPPING OF IRRIGATED LAND USING DIGITAL ANALYSIS TECHNIQUES (TASK II)


The digital analysis of Landsat multitemporal data for inventorying
irrigated land received increasing emphasis in 1979. The information requirements
for Task II were essentially the same as for the manual analysis of irrigated
land. That is, the type of information needed is the estimation and/or
mapping of irrigated land; the area of summary will eventually be the same as
for Task I; and the performance criteria for the estimation procedure use ±5%
at the 95% level of confidence as a baseline.

Although representing only 20% of the total effort this year, significant
progress was made on several sub-tasks. The sub-tasks were designed to address
the major goals of 1979: (1) analyze and evaluate methodologies for the
registration of multitemporal Landsat digital data and (2) test and evaluate
various classification procedures. Three test sites were used: a 1° block
in the Sacramento Valley, studied by UCB for registration and classification
procedures; and, at UCSB, a 1° block and three 7.5' quadrangles for testing
registration, with the same three 7.5' quadrangles used to evaluate
classification methodologies (Figure 3-1).


5.1 REGISTRATION OF MULTITEMPORAL LANDSAT DIGITAL DATA 

Because the precise registration of multitemporal Landsat digital data is
imperative for accurate classification, significant effort was put into the
exploration and testing of two major registration procedures:

• Control point least-squares analysis, and

• Cross correlation


5.1.1 Control Point Least-Squares Analysis - Remote Sensing Research Program
(RSRP), UC Berkeley

In evaluating the candidate registration system, a number of questions 
were addressed: 


• Could this system be efficiently used on a
minicomputer?

• What, if any, problems would be encountered when
"sewing" adjacent Landsat paths together?

• How many control points are required to satisfactorily
register and rotate to north multitemporal Landsat
scenes?

• How could files be created that are based on USGS
7.5' quadrangles (DWR's standard map base)?

The test site selected for analysis was in the Sacramento Valley and consisted
of a 1° block divided into four 30' segments. Each 30' block was a set of
sixteen 7.5' quadrangles (Figures 5-1 and 3-1). The 30' block size was selected
because: (1) it was convenient for storing and manipulating Landsat data in a
variety of forms for real time interaction, (2) coordinate transformations
performed on an area this size were expected to maintain acceptable multitemporal
registration accuracy at the 15' and 7.5' quad size and (3) the 30' block is a
multiple of the 7.5' quad, which is DWR's standard reporting unit and which is
always located completely on one Landsat scene.

The multitemporal Landsat data used as a test set for registration was
also to be used for classification of irrigated land in the second sub-task.
Date selection was controlled by a number of factors: (1) a three-date
system, similar to that used in Task I, was to be tested; (2) the test dates
selected should, therefore, mimic the time windows used in a Task I system;
and (3) 100% ground data for the test site was available for the 1976 growing
season. The dates selected for registration and classification were:


Scene 47-33 

30 May 76 
28 August 76 
3 October 76 


Scene 48-32 

22 May 76 
20 August 76 
4 October 76 


The first step in the multitemporal registration of the Landsat data was
to create, for each date, a data file on the disk containing an entire 30
minute block. Initially, the raw data was displayed from tape and point and
line coordinates for the area containing the block were determined. This
area, somewhat larger than the block itself in order to accommodate a
north-south rotation, was then placed on the RSRP data disk. Because the block area
usually required more than one quad of the Landsat scene, it was necessary
to (1) create a file from each Landsat quad and (2) merge those files to
create a new disk file containing the 30 minute area. To merge the Landsat
files, a dummy file was created of the appropriate size with all values set
at zero. The Landsat files were then transferred to the proper coordinates
in the dummy file and the area was "sewn" together. This creating and merging
was done for each of the three dates: late May, late August, and early October.
An MSS 7/5 ratio band was then created for each date by multiplying the value
in MSS 7 by 2 and dividing that product by the value in MSS 5 for each pixel.
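
The merging and ratio steps can be sketched with plain Python lists standing
in for the disk files. The zero-filled dummy file and the 2 x MSS 7 / MSS 5
ratio follow the text; the function names, the in-memory layout, and the
handling of zero-valued MSS 5 pixels (set to 0 here) are assumptions.

```python
def make_dummy(n_lines, n_points):
    """Create the zero-filled 'dummy file' as a list of lines."""
    return [[0] * n_points for _ in range(n_lines)]


def paste(dummy, block, top, left):
    """Transfer one Landsat file into the dummy file at (top, left)."""
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            dummy[top + r][left + c] = value


def ratio_band(mss7, mss5):
    """MSS 7/5 ratio band: 2 * MSS7 / MSS5 for each pixel.

    Assumption: pixels with MSS5 == 0 (e.g. unfilled dummy cells)
    are set to 0 rather than raising a division error.
    """
    return [
        [(2 * v7) / v5 if v5 else 0 for v7, v5 in zip(row7, row5)]
        for row7, row5 in zip(mss7, mss5)
    ]


# Two hypothetical 2x2 Landsat quads "sewn" side by side.
merged = make_dummy(2, 4)
paste(merged, [[1, 2], [3, 4]], top=0, left=0)
paste(merged, [[5, 6], [7, 8]], top=0, left=2)
ratio = ratio_band([[10, 6]], [[5, 0]])
```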

A set of control points was selected to initiate registration of the
multitemporal data set. One set of control points was used for the three dates.
These points were distributed as evenly as possible over the 30' block with
approximately three points per 7.5' quad area. Control point coordinates were
obtained by displaying the disk file for each date, moving the cursor on the TV
monitor to the selected point, and recording the x and y coordinates. Control
points were selected based on (1) the ease with which they could be located on
the three dates of Landsat and the DWR ground data maps, and on (2) an
approximately even distribution of points over the 30' block. The x and y
coordinates of each control point were then measured on 7.5' quads in 1/60
inch increments. Measurements were made using the upper left corner of each
7.5' quadrangle as the origin.

Dimensions for the 30' north-south ground coordinate computer file were
set at 620 points by 800 lines. These dimensions were chosen to (1) allow
full display of two 7.5' quadrangles at a time on the TV monitor (each of
dimension 155 points by 200 lines), and to (2) give a map cell size of
approximately 0.5 hectare (1.2 A). These dimensions were then used to convert
the map coordinates (in inches) to ground coordinates using the following
formulas:

X_{nf} = \frac{X_m}{W} \times 155 + (N \times 155)        (18)

Y_{nf} = \frac{Y_m}{L} \times 200 + (M \times 200)        (19)

where

X_{nf} = X value in new file

X_m = X value on the map

Y_{nf} = Y value in new file

Y_m = Y value on the map

W = 7.5' map width in inches
L = 7.5' map length in inches

N = 0, 1, 2, or 3 depending on whether the 7.5' map is the first, second,
third or fourth from the west side of the 30' block

M = 0, 1, 2, or 3 depending on whether the 7.5' map is the first, second,
third or fourth map from the north side of the 30' block

The control point coordinates for the three Landsat dates and the new
ground file were run through the regression program DANIEL. This program
calculated the equations necessary to transform the Landsat data to the new
ground coordinate file. These equations were of the form:

X_{Landsat} = b_0 + b_1 X_g + b_2 X_g^2 + b_3 Y_g + b_4 Y_g^2 + b_5 X_g Y_g        (20)

and

Y_{Landsat} = b_6 + b_7 X_g + b_8 X_g^2 + b_9 Y_g + b_{10} Y_g^2 + b_{11} X_g Y_g        (21)

where X_g and Y_g on the right side of the equations are new ground file coordinates.
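
A least-squares fit of this kind can be sketched in plain Python. This is not
the DANIEL program; it assumes a second-order polynomial in the ground
coordinates (constant, linear, squared and cross-product terms) and solves
the normal equations by Gaussian elimination. All function names are
hypothetical, and the same fit would be run once for the Landsat X
coordinate and once for the Landsat Y coordinate.

```python
def design_row(xg, yg):
    """Terms of the assumed second-order polynomial for one control point."""
    return [1.0, xg, xg * xg, yg, yg * yg, xg * yg]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_polynomial(ground_points, landsat_values):
    """Least-squares coefficients via the normal equations (A'A) b = A'y."""
    rows = [design_row(x, y) for x, y in ground_points]
    k = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * v for r, v in zip(rows, landsat_values)) for i in range(k)]
    return solve(ata, aty)


def predict(coeffs, xg, yg):
    """Evaluate the fitted polynomial at one ground coordinate."""
    return sum(b * t for b, t in zip(coeffs, design_row(xg, yg)))


# Synthetic check: recover known coefficients from 16 control points.
true_coeffs = [10.0, 1.2, 0.01, -0.3, 0.02, 0.005]
points = [(float(x), float(y)) for x in range(4) for y in range(4)]
values = [predict(true_coeffs, x, y) for x, y in points]
fitted = fit_polynomial(points, values)
```

With real control points the residuals at each point, rather than recovered
coefficients, would indicate how well the transformation fits.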

The equations from DANIEL were used in the program COTRANS to place the 
Landsat data into the new file. This program resampled the data by using the 
DANIEL equations and the coordinates for each new file cell to predict the cor- 
responding location in the original Landsat file. The data values for that 
pixel were then transferred to the cell in the new file. This was done for the 
7/5 ratio bands for each date, the end product being a file 620 points by 800
lines with three bands: May, August, and October. (When a 30' block was not
covered by a single Landsat frame, the appropriate portions of each frame were
resampled and then sewn together.) The data had been rotated, so that the new
file corresponded to the map, with a north-south orientation, and transformed so
that a particular cell represented the same point on the ground for all three
dates.
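
The resampling step can be illustrated with a short sketch. It is not the COTRANS program itself; it assumes a second-order polynomial in the ground coordinates with six coefficients per axis and a nearest-pixel transfer, and the names poly2 and resample are hypothetical.

```python
def poly2(b, xg, yg):
    """Evaluate b0 + b1*x + b2*x^2 + b3*y + b4*y^2 + b5*x*y."""
    return (b[0] + b[1] * xg + b[2] * xg * xg
            + b[3] * yg + b[4] * yg * yg + b[5] * xg * yg)


def resample(landsat, bx, by, n_points, n_lines, fill=0):
    """Fill a new north-south file by looking up, for every new cell,
    the predicted pixel in the original Landsat array (a list of rows)."""
    out = []
    for yg in range(n_lines):
        row = []
        for xg in range(n_points):
            col = int(round(poly2(bx, xg, yg)))    # predicted Landsat sample
            line = int(round(poly2(by, xg, yg)))   # predicted Landsat line
            if 0 <= line < len(landsat) and 0 <= col < len(landsat[0]):
                row.append(landsat[line][col])
            else:
                row.append(fill)                   # outside the Landsat frame
        out.append(row)
    return out
```

Working from the new file back to the original image, as the text describes, guarantees that every new cell receives exactly one value, with no gaps or double assignments.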
Least Squares Analysis of Landsat to Map Registration

Registration of multiple Landsat dates to a north-south ground coordinate 
system was accomplished using control points as described above. A major 
question associated with this technique was how many control points were required 
to give a satisfactory registration. Evaluation of this problem based on 
repeated registration of the same area using differing numbers of control points 
would have required an impractical amount of analyst and computer time. As an
alternative, regression equations of the same form as equations 20 and 21 were
developed for differing numbers of control points. These equations were then
used directly to estimate average registration accuracy for the set of all pixels
in the map product. This analysis was applied on two block sizes, 30' and 1°.


The 30' Sutter Block (Figure 5-2c) was selected as the initial test area.
This region represented a typical Central Valley agricultural/nonagricultural
land use mix. Seventy-seven control points were selected over the Sutter Block
in as uniform a pattern as possible, averaging approximately four to five
control points per 7-1/2' quadrangle. After recording and verifying the (X,Y)
ground and corresponding Landsat coordinates, regression equations of the form
given previously were fit to the 77 control points for each of the three 1976
Landsat dates (May, August, and October). These equations were then used to
generate an 11 x 11 matrix of expected (X,Y) Landsat coordinate pairs systematically
covering the 30' block.

Next, the number of control points was reduced to 66, then to 46, 33,
16 and finally 8 by culling points systematically from the original 77. In
each case, culling was performed by removing one control point from each 7-1/2'
quadrangle.* $X_{Landsat}$ and $Y_{Landsat}$ equations were fit to each set
of points (66, 46, 33, 16, 8) for each of the three Landsat dates. The resulting
equations were used to predict Landsat coordinate pairs for the same 11 x 11
matrix of systematically located map reference points used previously.

Registration error introduced by reducing the number of control points below
77 was computed for each coordinate pair in the 11 x 11 matrix by subtracting
the predicted value (based on k = 66, 46, 33, 16, or 8 control points) for X or
Y from its expected value ($k_{max}$ = 77). That is,

$$d_{X_{ij}(k)} = X_{ij}(77) - X_{ij}(k) \tag{22}$$

$$d_{Y_{ij}(k)} = Y_{ij}(77) - Y_{ij}(k) \tag{23}$$

where i and j represent the row and column indices in the 11 x 11 matrix of
points systematically covering the area to be registered.


*A point was removed unless removing that point would leave no control point in
that particular 7-1/2 minute quadrangle. For the 8 point case, one point was
taken from every other 7-1/2' quadrangle in a checkerboard fashion.
Chico 30' block (Scene 48-32)     Chico 30' block (Scene 47-33)

Number of control points for the 7.5' quadrangles within each 30' block.












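The average registration error per pixel, sign ignored, introduced by
k < 77 control points was then estimated by averaging the squared deviations
and taking the square root, viz.

$$\bar{d}_{X_k} = \sqrt{\frac{1}{121} \sum_{i=1}^{11} \sum_{j=1}^{11} d_{X_{ij}(k)}^2} \tag{24}$$

$$\bar{d}_{Y_k} = \sqrt{\frac{1}{121} \sum_{i=1}^{11} \sum_{j=1}^{11} d_{Y_{ij}(k)}^2} \tag{25}$$
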


$\bar{d}_{X_k}$ and $\bar{d}_{Y_k}$ were defined as the relative registration error for control point
density k in the X and Y dimensions. That is, this component of error was
defined in terms of differences from (relative to) Landsat coordinate positions
predicted with $k_{max}$ = 77 control points.

It should be noted that straight averages of $d_{X_{ij}(k)}$ and $d_{Y_{ij}(k)}$ would tend
to give smaller values than $\bar{d}_{X_k}$ or $\bar{d}_{Y_k}$ due to cancellation of differences over
the population of pixels to be registered. However, such straight averages are
misleading. Classification error depends (in part) on the absolute misregistration
for any pixel in the scene, not on an average, sign-considered registration
error computed over all pixels.

The average total absolute error per pixel was defined as the Euclidean
sum of the relative error ($\bar{d}_{X_k}$ or $\bar{d}_{Y_k}$) and a term representing the error associated
with the regression model used to predict Landsat coordinate positions with 77
control points. This second error component was taken to be equal to the mean
squared error (MSE) computed for these regression equations. Thus, assuming $\bar{d}$
and MSE are independent, the average total error per pixel was computed as

$$D_{X_k} = \sqrt{(\bar{d}_{X_k})^2 + MSE_{X_{77}}} \tag{26}$$

$$D_{Y_k} = \sqrt{(\bar{d}_{Y_k})^2 + MSE_{Y_{77}}} \tag{27}$$

for the X and Y dimensions, respectively.* In words,

$$D = \text{Ave Absolute Error Per Pixel} = \sqrt{
\left(\begin{array}{c}\text{error (bias)}\\ \text{introduced with}\\ k \text{ points instead}\\ \text{of } k_{max}\end{array}\right)^2
+ \left(\begin{array}{c}\text{average of the square of the}\\ \text{deviations between predicted}\\ \text{coordinates and actual}\\ \text{coordinates based on } k_{max}\\ \text{control points}\end{array}\right)}$$
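
The two error measures can be computed as in the following sketch, which assumes the reference and reduced-set predictions are held as equal-sized grids (lists of rows) and that the regression MSE for the reference equations is already known. The function names are hypothetical.

```python
import math


def relative_error(reference, predicted):
    """Root mean square of (reference - predicted) over a grid of
    coordinate values; the sign of each deviation is ignored."""
    diffs = [r - p
             for ref_row, pred_row in zip(reference, predicted)
             for r, p in zip(ref_row, pred_row)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def total_error(d_bar, mse_reference):
    """Euclidean sum of the relative error and the regression MSE
    of the reference (maximum control point) equations."""
    return math.sqrt(d_bar ** 2 + mse_reference)
```

On a small grid with deviations of -1, 1, 0 and -2 pixels, the straight average is -0.5 while the root mean square is about 1.22, which illustrates the cancellation effect noted above.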


$D_{X_k}$ and $D_{Y_k}$ were computed for k = 77, 66, 46, 33, 16, and 8 in the 30' Sutter
block. The results for X are plotted in Figure 5-3a and for Y in Figure 5-3b.
In addition,


$$D_k = \sqrt{D_{X_k}^2 + D_{Y_k}^2}, \tag{28}$$


the Euclidean average of the two errors (expressed in units of vertical pixels)
was calculated and plotted in Figure 5-3c. Inspection of these figures indicated
that registration error generally began to increase significantly below approximately
40 control points. Adding five control points to this number, to account for
culling of control points giving significant regression outliers, led to a
recommendation that at least 45 uniformly distributed control points be obtained for
registration on a 30' block basis.

Three more 30' blocks were processed using a 45 control point objective. 

These were the Maxwell block (40 points after culling), the Corning block (37 
points), and the Chico block (23 points from scene 48-32 and 32 points from 
scene 47-33) as shown in Figure 5-2a-e. 

Visual inspection of the Sutter block registered with 77 control points 
showed that date-to-date registration error did not exceed one pixel. When 
this error occurred it was typically located along field boundaries originally 
flush with Landsat line-column geometry that had become diagonal in the new 
North-South coordinate system. The same situation obtained in the other three 
30' blocks registered using the 45 point rule. As expected, wildland areas 
having very few control points tended to have larger registration errors. 


*A third component of error exists. This is the error introduced by predicting
coordinate values away from control points with the regressions based on
$k_{max}$ = 77 points.
Visual inspection of the four 30' blocks also indicated that Landsat-to-ground
registration error was within error bounds predicted by $D_k$. Field and
road geometry in the North-South coordinate system appeared excellent. Seams
between irrigated-nonirrigated class maps for the four 30' blocks were difficult
if not impossible to detect, further confirming proper ground registration.

A further study was performed to determine if a less dense (and therefore
less costly) network of control points would be required if registration
was performed on an area larger than a 30' block. Time and resources permitted
an examination of this problem only on the 1° block area (shown in Figure 5-1)
used for the four 30' blocks registered earlier. Since this area was partially
covered by two Landsat frames, and since separate registration regression
equations were required for each frame, analysis of the large area registration
problem was limited to the southeastern two thirds of the 1° block covered by
scene 47-33.

The 7.5 minute quadrangles used in this registration problem are shown in 
Figure 5-4. Numbers inside each rectangle represent the number of control 
points available for registration within each quadrangle. Control points 
obtained in the previous work with 30' blocks were used to provide the best 
possible comparison of registration error. 

A series of (x,y) regression equations of the form specified earlier were
then computed using successively larger sets of control points. Thus the
first pair of regression equations (predicting $X_{Landsat}$ and $Y_{Landsat}$) were
based on one control point selected at random from every other quadrangle in
a checkerboard fashion. The next pair of regressions were based on one control
point from each quadrangle, the next on two points from each quadrangle, and
so on.

Using the pair of regression equations based on the maximum number of
control points ($k_{max}$ = 149) as a reference, $D_{X_k}$, $D_{Y_k}$, and $D_k$ were computed
for all other equations based on k < $k_{max}$ control points. These
results are plotted in Figures 5-5a through 5-5c.


Based on the results of this latter study, it appears that registration
satisfactory for producing irrigated-nonirrigated class maps can be obtained
on a one-half 1° block basis using 45-60 control points. The behavior of the
regression relationships explored here also strongly suggests a similar number
for registration on a 1° block basis. As in the case of the 30' blocks,
these control points should be spread in as uniform a manner as possible
over the block and in the area surrounding the block.

An accounting caution on areas as large as 1° is advised. Suppose the 
resulting digital image or class map for the 1° block is represented by a 
rectangle having x columns and y rows, each (x,y) map cell of equal size. 
Then, in California's latitudinal range, a map cell in the top row will 
represent an actual ground area approximately one percent smaller than a map 
cell in the bottom row of the block. This effect is due to the convergence 
of longitudinal lines at the North Pole. Consequently, in producing area 
estimates, a correction (scaling factor) must be introduced by row (or by 
group of rows) to standardize the ground area represented by each map cell. 
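
A row-by-row scaling factor of this kind could be computed as sketched below. The sketch assumes a spherical Earth and map cells that each span a fixed increment of longitude and latitude, so that cell ground area varies with the cosine of latitude; the function name and the 39° to 40° block used in the usage note are hypothetical.

```python
import math


def row_area_factors(lat_top_deg, lat_bottom_deg, n_rows, ref_lat_deg=None):
    """Relative ground area of a map cell in each row (row 0 at the top),
    expressed as a fraction of a cell at the reference latitude."""
    if ref_lat_deg is None:
        ref_lat_deg = lat_bottom_deg
    step = (lat_top_deg - lat_bottom_deg) / n_rows
    ref = math.cos(math.radians(ref_lat_deg))
    factors = []
    for row in range(n_rows):
        lat = lat_top_deg - (row + 0.5) * step   # latitude at the row center
        factors.append(math.cos(math.radians(lat)) / ref)
    return factors
```

For a block running from 40° N down to 39° N in 1600 rows, the top row factor is about 0.986, that is, a top-row cell covers roughly one percent less ground than a bottom-row cell. Multiplying each row's cell count by its factor before summing gives area estimates in bottom-row cell units.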
5.1.2 Cross Correlation Registration Procedure - Geography Remote Sensing Unit
(GRSU) U.C. Santa Barbara

The digital analysis of Landsat CCT's requires the ability to accurately 
register multidate imagery, particularly for identifying specific crops. As 
part of this year's effort, two registration procedures were explored — a 
manual procedure where tiepoints between two dates are selected by an analyst 
using a television monitor display and an automatic procedure using cross corre- 
lation and regression analysis. 

Manual Approach 

The manual procedure was undertaken on a portion of the statewide Landsat 
mosaic generated by the Jet Propulsion Laboratory (JPL) for the California 
Department of Forestry*. Using the 1° Sacramento West quad (August 1976) from 
JPL as a base, two other dates (30 May 1976 and 03 October 1976) were registered. 
Visual analysis of the three dates yielded 50 points that were observable on each
date. The features were selected in a fairly systematic network over the entire
scene.

Subimages from the base date, around each of the tiepoints, were then
displayed on a video monitor and line and sample coordinates noted with a
moveable cursor. The same subscenes from one of the other dates were displayed on
a second monitor and the line and sample coordinates of the tiepoints noted.

This procedure was repeated for the final date. When this had been completed 
for all 50 tiepoints, the appropriate line and sample coordinate values were 
used as input to GEOMA, a VICAR program designed to register digital images. 

As a check, the 50 subimages for each date were overlaid and displayed
on the video monitor after registration to determine the performance of the
GEOMA procedure. Each date was displayed in a different color and alternately 
turned on and off. Those that were offset from the base date were re-evaluated 
to determine the correct tiepoint coordinates and the results placed in GEOMA 
for a second time. When the analyst had evaluated all 50 points and was satis- 
fied with the fit of each, the three dates were considered registered and 
available for further analysis. While a detailed examination for goodness of
fit was not conducted, the May and October images each seemed to be within 1 or 2
pixels of the base over the entire image. A greater number of tiepoints could
improve this fit.


Cross Correlation Approach


Two tests were conducted using a VICAR cross-correlation procedure, PICREGB,
the first for a small area in Kern County and the second for a large area near
Sacramento. The premise behind this approach is that stable features common
to each date will drive the selection of accurate tiepoints. The presence of
noise in the images should not severely impact the procedure so long as the
noise is random. Any adverse impact of noise can be minimized through image
enhancement of stable features.


*Managed through NASA-Ames Research Center as part of the California Integrated
Remote Sensing System (CIRSS) Applications System Verification and Transfer
Program.


Such a procedure has been developed in LACIE for registration of multidate
agricultural scenes.¹ In most cases, the biomass content of fields changes
dramatically from season to season. The major stable features are field
boundaries, which can be accentuated by a digital high pass filter. In the
LACIE approach, a high pass filter was applied to bands 5 and 7, and a binary
image was created for each band with the lowest 85% of the pixels being set to
0 and the upper 15% being set to 255. The assumption here is that approximately
15% of the scene is made up of boundaries and edges. The edge images for bands
5 and 7 were added together to yield an image which showed the location of
boundaries present on either or both bands. By using both bands, a greater
population of edges was available for analysis. Composite edge images for
multidates can then be registered using the cross correlation procedure.
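
The edge-image construction can be sketched as follows. The high pass kernel is not specified in the text, so a simple horizontal first difference is assumed here purely for illustration; the function names are hypothetical, and the 15 percent figure is the edge proportion quoted above.

```python
def high_pass_horizontal(img):
    """Assumed stand-in filter: absolute difference between each pixel
    and its left neighbor (responds to north-south trending edges)."""
    return [[0 if c == 0 else abs(row[c] - row[c - 1])
             for c in range(len(row))] for row in img]


def binarize_top_percent(img, edge_percent=15):
    """Set the highest-valued edge_percent of pixels to 255, the rest to 0.
    Tied values at the cutoff are all kept, so the fraction can exceed 15%."""
    flat = sorted(v for row in img for v in row)
    cutoff = flat[len(flat) * (100 - edge_percent) // 100]
    return [[255 if v >= cutoff else 0 for v in row] for row in img]


def composite(edges_a, edges_b):
    """Boolean addition: an edge present on either band is kept."""
    return [[max(a, b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(edges_a, edges_b)]
```

Applying the first two functions to bands 5 and 7 separately and then combining the results with composite gives one edge image per date.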

Using a three 7.5 minute quadrangle study area in Kern County, a test
of the cross correlation procedure was undertaken. Three dates of imagery
(June, July, and October) were used. All three data sets had been previously
registered, using visual techniques, to within 0.5 pixel accuracy, so the test
consisted of judging the ability of the cross correlation procedure to
correctly select the corresponding tiepoint coordinates. Assuming a maximum
misregistration of 0.5 pixels, the average expected misregistration in either
the X or Y direction is 0.25 pixels, or up to 0.35 pixels in a diagonal direction.
The 30 tiepoints were selected from over the scene in a systematic fashion.

Each date was enhanced as described in the LACIE procedure with the exception
that the high pass filter used enhanced only vertical features. The result
was that north-south trending boundaries were emphasized while the east-west
trending lines were not. While not specifically evaluated, the assumption
that 15% of the scene consisted of edges was accepted.

The cross correlation was carried out on a window of 32 X 32 pixels on
a base image window of 64 X 64 pixels around each of the 30 tiepoints. Those
points for which a poor correlation was found were removed from the analysis.
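
The window search can be illustrated with a small sketch. It is not the PICREGB program; it assumes a normalized cross-correlation score and an exhaustive search of every window position inside the larger base window, and the function names are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-size 2-D lists."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa)
                    * sum((y - mb) ** 2 for y in fb))
    return num / den if den else 0.0   # flat patches carry no information


def best_match(search, window):
    """Slide window over search; return (row_offset, col_offset, score)
    of the position with the highest correlation."""
    w_rows, w_cols = len(window), len(window[0])
    best = (0, 0, -2.0)
    for r in range(len(search) - w_rows + 1):
        for c in range(len(search[0]) - w_cols + 1):
            patch = [row[c:c + w_cols] for row in search[r:r + w_rows]]
            score = ncc(patch, window)
            if score > best[2]:
                best = (r, c, score)
    return best
```

A tiepoint whose peak score falls below some chosen level would be treated as a poor correlation and dropped; that acceptance level is not stated in the text.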

The remaining points, consisting of line and sample coordinates for the base
date and the coordinates (output from PICREGB) for the date to be registered,
were analyzed by a simple regression program that determines how well one
group of points predicts the other. A residual of approximately 2 pixels was
accepted as the upper threshold criterion for editing out bad tiepoints. After
editing out those points with too large a residual value, the remaining points
and their values from PICREGB were analyzed to determine the average Euclidean
distance between the base date and the date to be registered. The results are
shown in Table 5-1. For reasons not yet clear, the performance of the band
5+7 composite for October and July was not as good as that for band 5 alone.
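
The residual editing step might look like the sketch below. The form of the regression program is not given in the text, so a separate simple linear regression for the line and sample coordinates is assumed; only the 2 pixel threshold comes from the text, and the function names are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def edit_tiepoints(base, other, max_residual=2.0):
    """Keep tiepoint pairs whose line and sample residuals are both
    within max_residual pixels of the fitted relationship."""
    fits = [fit_line([b[k] for b in base], [o[k] for o in other])
            for k in (0, 1)]
    kept = []
    for b, o in zip(base, other):
        residuals = [o[k] - (fits[k][0] * b[k] + fits[k][1]) for k in (0, 1)]
        if all(abs(res) <= max_residual for res in residuals):
            kept.append((b, o))
    return kept


def average_distance(pairs):
    """Average Euclidean distance between paired (line, sample) points."""
    return sum(math.hypot(o[0] - b[0], o[1] - b[1])
               for b, o in pairs) / len(pairs)
```

Because a single bad tiepoint also pulls the fitted line toward itself, a second pass of fitting after the first round of editing may be worthwhile when the number of tiepoints is small.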


¹Grebowsky, G. J. 1978. LACIE Registration Processing: A Technical Description
of the Large Area Crop Inventory Experiment (LACIE). LACIE Symposium
Proceedings of Technical Sessions, Volume 1. NASA Johnson Space Center.
October 23-26, 1978; 37-97.


Table 5-1

Average Euclidean Distance (Pixels) Between
Tiepoints Selected By Cross Correlation
(Kern County 3-Quad Study Area)

                                              Average
Date 1            Date 2       Band(s)        Distance

June              October      5              2.37
June              October      5+7            0.93
June              July         5              0.91
June              July         5+7            0.77
October           July         5              0.60
October           July         5+7            0.65
June + October    July         5+7            0.48

The test was continued to determine the value of a multidate composite
base image to which additional dates could be registered. By increasing the 
number of edges for registration, the performance of the procedure should 
improve. Table 5-1 shows the average Euclidean distance between a July (band 
5 and band 7) edge image and a June and October (band 5 and band 7) composite 
edge image. As can be seen the apparent performance of the cross correlation 
procedure is improved due to a greater population of edges in the base image. 
Since the Boolean addition of two dates together is a relatively inexpensive 
procedure, a fine tuning of the automatic registration procedure can be done 
without incurring significant additional costs. 

The second test of the cross correlation was conducted on a portion of
the Sacramento 1° quadrangle generated by JPL. Using the JPL August 1976
scene as a base, a May and October date were registered. This test differed
from the first in that it was over a larger area (1500 X 1000 pixels), the
proportion of edges in the scenes was not set at an arbitrary value but was
instead determined by visual examination of the high pass-filtered images, and
the procedure involved the registration of previously unregistered images.

Four tiepoints were selected, in general proximity to the corners of 
the images, in order to perform a rough registration. This was necessary 
to bring all three images into the same coordinate scheme. As in the Kern 
County test, highpass filter images were created for bands 5 and 7 for each 
date. A cutoff value was visually determined and the data was given a 
binary stretch. A composite image was formed for each date by boolean addi- 
tion of bands 5 and 7 (Figure 5-6a-c). 

Using a systematic grid of 126 tiepoints, spaced 100 pixels apart in the
line and sample direction, the May and October images were analyzed to find
those tiepoints that could be matched with the August base. PICREGB, a VICAR
program, searches the area around the input tiepoint locations to find the
best pixel-to-pixel match. If an adequate correlation is not found for a
particular tiepoint grid location, that tiepoint is removed during subsequent
editing. Afterwards, those tiepoints with good correlation were analyzed by
regression to remove any tiepoints whose spatial location was inconsistent
with the overall pattern of tiepoints. The criterion developed was based on
visual examination of the histogram of tiepoint residuals (observed-computed).
In most cases, the peak of the histogram centered between -0.5 and 0.5. A
cutoff was used when the number of tiepoints for a particular residual value
fell below three and did not rise at least to a value of three within one
residual unit (see Figure 5-7).

The remaining tiepoints were then used to register the May and October
dates to the JPL base. Of the 126 tiepoints in the original systematic grid,
there were 41 and 52 tiepoints retained for registration for May and October,
respectively. Figure 5-8 shows a multidate color composite made from band 5 of
each date.
Figure 5-6 (cont'd)
Figure 5-3



To test the goodness-of-fit between the registered images, each of the
images (1500 X 1000 pixels) was divided into 24 subscenes (250 X 250 pixels).
For each subscene, a feature that was identifiable on each date was located
on the television monitor. Generally, this feature was a road intersection.
Using a cursor, the line and sample coordinates of the feature were obtained
on each date. Tables 5-2 and 5-3 show the coordinate values and the Euclidean
distance between the May and August dates and the October and August dates. The
average misregistration for May and October was 1.36 and 1.56 pixels, respectively.

As can be seen, the majority of the points were within one pixel registration.
The large misregistrations occurred in those areas near the edge, beyond the
main body of tiepoints. This was seen particularly on the eastern boundary of
the images where the cross correlation procedure, optimized for stable edges,
did not yield tiepoints in the native vegetation of the Sierra foothills. For
reasons not entirely clear, the procedure yielded few tiepoints on the northern
boundary, although this area was covered by agricultural fields.
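
The distance column of Tables 5-2 and 5-3 and the average misregistration follow directly from the line and sample coordinates. A minimal sketch, with hypothetical function names, is:

```python
import math


def euclidean(base, other):
    """Distance in pixels between two (line, sample) coordinate pairs."""
    return math.hypot(other[0] - base[0], other[1] - base[1])


def average_misregistration(pairs):
    """Mean Euclidean distance over a list of (base, other) pairs."""
    return sum(euclidean(b, o) for b, o in pairs) / len(pairs)
```

For the first test point in Table 5-2, the base coordinates (194, 98) and the May coordinates (192, 92) differ by 2 lines and 6 samples, giving a distance of about 6.3 pixels.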

The computing cost for the procedure was approximately $100.00 (or $30.00
at the current overnight rate). This can be broken down for a 1500 X 1000
image as seen in Table 5-4.

Improvements in the procedure are possible. A greater saturation of
tiepoints in the input grid would result in a greater number of tiepoints
being retained. This can be done for the image as a whole or for selected
problem areas (natural environments, image edges). A second improvement would
involve localizing the cutoff procedure on the highpass filter image. For
this work a global cutoff was used, but a moving block of 100 X 100 pixels,
for example, would allow for local definition of edges. Presently, optimization
of edge definition for one environment may be done at the expense of
another.
A third possibility that remains to be examined is the use of non-binary
input to PICREGB. While the current procedure used the binary approach proposed 
by LACIE, it may be that optimal edge definition lies somewhere between the 
binary and continuous extremes. 

Improvements in the edge image inputs to PICREGB appear possible if a
texture image is used in place of one subjected to a high pass filter. Figures 5-9a
and b show a texture and high pass image, respectively, for a 3-quadrangle area
in Kern County. While a comparison between the two is limited since the high
pass filter was operating in one dimension and the texture procedure is
two-dimensional, the texture image has much less "sparkle" and its edges are more
easily separable (in the spectral domain) from the non-edge background. The
texture image is created by computing the standard deviation for a 3 x 3 kernel.

The present regression and residual analysis utilizes a global regression 
to detect tiepoints significantly different from the general pattern. A local- 
ized regression may result in a better measure of residuals. 

Finally, upon completion of this effort, it was discovered that TIECONM,
a VICAR program that organizes the input tiepoints into vertices of triangular
areas for localized "rubber-sheeting", broke down in those areas beyond the
main body of tiepoints, so that extrapolation of the computed fit was generally
not valid toward the edges. A new version of this program, which does not
have this extrapolation problem, has been received from JPL but was not used
to re-run this analysis. The average Pythagorean distance for those test points
that were not beyond the perimeter of the main body of tiepoints was computed.
The average misregistration from the August base was only 0.40 pixels and 0.54
pixels for May and October, respectively.

We feel that the procedure shows great promise as a cost effective technique 
for automated registration. The new EROS CCT registered format may preclude 
the need for an initial rough registration. A registration package could be 
developed that essentially automates the entire procedure. 


5.2 CLASSIFICATION OF MULTITEMPORAL LANDSAT DIGITAL DATA

The second major sub-task of Task II evaluated various classification
techniques for estimating and mapping irrigated land within California. Three
classification methods were analyzed this year to provide test results needed to
recommend a system for a large scale demonstration (i.e., two hydrologic basins)
in 1980. The algorithms evaluated were: (1) MSS band 7/MSS band 5 simple linear
discriminant, (2) Kauth-Thomas greenness transform, and (3) cluster labeling.
UCSB studied all three techniques on a three 7.5' quadrangle test site in Kern
County (southern San Joaquin Valley); UCB tested the 7/5 discriminant using the
1° block test site in the Sacramento Valley.
Table 5-2

Cross Correlation of May to August Base
Registration Accuracy
(Sacramento Valley Study Area)

     JPL Base              May            Euclidean
  Line    Sample      Line    Sample      Distance

   194       98        192       92          6.3
   159      360        158      359          1.4
   103      724        103      724          0
   232      864        235      865          3.2
   317      124        316      124          1
   356      398        356      398          0
   310      699        310      699          0
   459      947        463      849          4.1
   733      141        734      141          1
   700      477        701      477          1
   631      610        631      610          0
   732      903        733      905          2.2
   933      130        933      130          0
   973      405        973      406          1
   847      656        847      656          0
   902      930        903      930          1
  1203      187       1201      188          2.2
  1038      312       1038      312          0
  1100      694       1100      693          1
  1128      839       1127      839          1
  1330      182       1329      184          2.2
  1381      357       1381      358          1
  1432      611       1432      610          1
  1296      834       1294      834          2
Table 5-3

Cross Correlation of October to August Base
Registration Accuracy
(Sacramento Valley Study Area)

     JPL Base            October          Euclidean
  Line    Sample      Line    Sample      Distance

   194       98        194       95          3
   159      360        162      361          3.2
   103      724        105      714         10.2
   232      864        235      864          3
   317      124        316      123          1.4
   356      398        355      398          1
   310      699        309      700          1.4
   459      947        463      949          4.5
   733      141        733      141          0
   700      477        701      477          1
   631      610        631      611          1
   732      903        732      903          0
   933      130        934      130          1
   973      405        973      406          1
   847      656        847      656          0
   902      930        902      929          1
  1203      187       1203      187          0
  1038      312       1038      312          0
  1100      694       1100      694          0
  1128      839       1128      839          0
  1330      182       1329      183          1.4
  1381      357       1380      357          1
  1432      611       1431      610          1.4
  1296      834       1295      834          1
Table 5-4



206 29 8.70 





5.2.1 San Joaquin Valley Test Site - UCSB 


Classification Using 7/5 Cutoff Approach 

Our experience to date on Task I indicates that the Landsat identification 
of irrigated land in California is a relatively simple task because of the 
bright red color infrared appearance of crops against generally non-vegetated 
backgrounds. Task II calls for the digital implementation of this procedure. 
While standard classification procedures often make use of all four of Landsat's 
spectral bands, the cost of such an approach is significant when numerous dates 
are used and an area as large as California is involved. The use of simple
discriminants of redness that do not depend on all four channels of spectral
information, such as the 7/5 ratio, is an attractive means to significantly
reduce the amount of data to be processed.

A test of a simple discriminant of "redness" (7/5 ratio) was undertaken to
evaluate the ability of the discriminant to identify cropland on multidate
imagery and to test the use of a Boolean scheme to summarize multidate results.
The data selected for evaluation was a three 7.5' quadrangle study area in Kern
County. Three registered and rectified dates (6 June 1976, 21 July 1976, and
10 October 1976) were used. In addition, a crop map based on field data was
available for the study site.

The 7/5 ratio is a particularly effective discriminant for irrigated 
cropland. Band 5 returns a comparatively low brightness signal for vegetation 
while Band 7 returns a comparatively higher brightness signal. The result of 
the ratio is a value that is considerably higher for healthy vegetation than 
for other classes. Figure 5-10 is a plot of Bands 5 versus 7 for 21 July, 1976. 
The distribution is very similar to that resulting from plotting Kauth brightness
and greenness channels, shown in Figure 5-11. The relationship between the
two vegetation indices is verified in Figure 5-12, which plots greenness versus
ratio values and has a correlation of 0.91 (the plot shown is that of 5/7
ratios, the inverse of 7/5 and thus inversely related to greenness).

It is important to note that effective ratio cutoff values vary from 
date-to-date. In the spring, when there are numerous native grasses, the 
choice of the cutoff value must be conservative to avoid confusion. In the 
late summer or early fall, when most of the background vegetation is senescent, 
the selection of the minimally acceptable level of redness can be more liberal. 

Using the selected ratio cutoff point for each date, classified images
were created, with a value of 1 given to each pixel of irrigated vegetation
and 0 given to all others. In this fashion, images containing only vegetated
cropland were created. Figure 5-13 shows such an image for July. The three
registered dates were added together to result in a new image with four
possible pixel values (0, 1, 2, 3) representing the number of dates on which
healthy vegetation was found. When the three dates were added together,
those areas for which irrigated vegetation was present on at least one date
(sum > 0) were flagged. When reduced to the binary case, this closely mimics
the decision process and type of final product from Task I.
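
The ratio cutoff and Boolean summation can be sketched as follows. The cutoff values used in the usage note are hypothetical placeholders, since the cutoffs actually selected for each date are not given here, and the function names are illustrative only.

```python
def classify_ratio(band7, band5, cutoff):
    """1 where the band 7 / band 5 ratio meets the cutoff, else 0."""
    return [[1 if b5 > 0 and b7 / b5 >= cutoff else 0
             for b7, b5 in zip(row7, row5)]
            for row7, row5 in zip(band7, band5)]


def sum_dates(masks):
    """Add registered 0/1 images; each pixel becomes the number of
    dates on which vegetation was detected."""
    return [[sum(values) for values in zip(*rows)] for rows in zip(*masks)]


def irrigated_any(date_sum):
    """Binary product: flagged if detected on at least one date."""
    return [[1 if v > 0 else 0 for v in row] for row in date_sum]
```

With three dates and assumed cutoffs of 2.0, 2.0 and 2.5, a pixel exceeding the cutoff on every date sums to 3, one exceeding it only in July sums to 1, and both are flagged in the binary product. The date-specific cutoffs reflect the point made above that the effective cutoff varies with the season.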

It should be noted that each additional date added new information. The
omission of any one date would have resulted in a smaller measurement of the
amount of irrigated land. Because the earliest date used here was 6 June
1976, it is highly probable that a spring date would have increased the
amount even more, because small grains were already senescent by June and were
missed in this test of the 7/5 ratio.

Figure 5-12

Figure 5-13


Figure 5-14 shows those pixels that represent cropland irrigated on 0, 1, 2,
or 3 dates. This information represents another level of sophistication, in
that the number of times a field was irrigated, and not merely the fact that
it was irrigated at least once, is more valuable information for water demand
determination. While not presented here, the technique could be carried a step
further with classification using all possible permutations of dates. This
would define a temporal signal of land use that possibly could be correlated
with specific crop types, much like the use of crop calendars in photo
interpretation. Further work must be done in this area to determine the appropriate
dates for such an effort. Whether reducing the problem to only the "greenness"
and temporal domains is sufficient for crop identification must also be determined.

Our results indicated that, except for small grains, the ratio discriminant
and Boolean classifier were adequate for detecting irrigated land. As
mentioned previously, the use of a spring date would have caught the small
grains (the available May date was not used at the time because of high gain
settings in bands 4 and 5). Some apparent errors were noted where single pixels
were classified as irrigated, resulting in a slight salt and pepper effect.
This would probably be accentuated on spring dates where isolated patches of
native grasses are present. An editing procedure that removed clusters of only
one or two pixels could be used to remove much of the salt and pepper pattern.
This is of course what the manual interpreter does when using a minimum mapping
unit.


The techniques demonstrated have the advantage of reducing the four channels 
of information from each date to a single channel. When multidate Landsat is
considered, the number of possible band combinations also decreases. The 
reduced dimensionality of the data significantly decreases the cost associated 
with monitoring irrigated land. Because the study area was relatively small, 
there were no problems with the spatial extendability of the selected cutoff 
value. 

The statistics for a 5 date analysis of the same area were computed using 
the 7/5 ratio and Boolean approach. In addition to the June, July and October 
dates used previously, images for 1 May 1976 and 8 August 1976 were added to 
the analysis. 

Using Boolean addition, it was determined which dates yielded the best
1-, 2-, 3-, and 4-date estimate of irrigated lands (Figure 5-15). July proved
to be the best single date for discriminating irrigated lands using the 7/5
ratio. A two date analysis increased the amount of irrigated acreage
successfully discriminated by 12.6 percent, with May and August being the best
two dates. The addition of a third date increased the accuracy of the
classifier by 4.3 percent. In this case, May, June and July proved to be the
best combination, although any combination of the May date with any two summer
dates was close behind. Using four dates of Landsat (May, June, July and
October), the accuracy was increased by an additional 2.8 percent. Obviously,
the bracketing of the growing season with spring and fall dates is important to
capture the temporal dynamics of cropping. The use of all five dates increased
the classification accuracy to 97.4 percent when compared to a ground truth
map. The addition of the fifth date improved the performance of the 7/5 ratio
classifier by 1.8 percent.
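
The search for the best 1-, 2-, 3- and 4-date combinations can be sketched as an exhaustive comparison of date subsets. The sketch below is an illustration under stated assumptions, not the procedure actually run: the score used here (fraction of ground-truth irrigated pixels flagged on at least one date) is an assumed stand-in for the report's accuracy measure, and the masks, the `truth` vector and all function names are hypothetical.

```python
from itertools import combinations


def detected_fraction(masks, dates, truth):
    """Fraction of ground-truth irrigated pixels flagged on at least one date."""
    irrigated = [i for i, t in enumerate(truth) if t == 1]
    hits = sum(1 for i in irrigated if any(masks[d][i] for d in dates))
    return hits / len(irrigated)


def best_combinations(masks, truth):
    """For each subset size n, return (score, dates) of the best-scoring subset."""
    best = {}
    for n in range(1, len(masks) + 1):
        scored = [(detected_fraction(masks, combo, truth), combo)
                  for combo in combinations(sorted(masks), n)]
        best[n] = max(scored)
    return best


# Hypothetical flattened binary masks (1 = ratio above cutoff) for six pixels.
masks = {
    "may":     [1, 0, 0, 0, 1, 0],
    "june":    [0, 1, 1, 0, 0, 0],
    "july":    [0, 1, 1, 1, 0, 0],
    "october": [0, 0, 0, 0, 1, 0],
}
truth = [1, 1, 1, 1, 1, 0]  # hypothetical ground truth, 1 = irrigated

best = best_combinations(masks, truth)
```

With only five dates the number of subsets is small (31), so an exhaustive search of this kind is inexpensive.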
DATE SIGNIFICANCE TEST, 7/5 DISCRIMINATOR
% CROPLAND CLASSIFICATION ACCURACY
KERN COUNTY TEST SITE: APPROXIMATELY 50,000 ACRES TOTAL

The ratio classifier appears to perform satisfactorily. The dimensionality
of the decision process is greatly reduced by collapsing each date to a single 
channel of information. Because the decision logic requires the classification 
of a pixel as irrigated if its 7/5 ratio value is greater than a particular 
threshold value, it is important that an adequate number of dates be selected. 


Classification Using Kauth-Thomas Greenness Transform

As shown in Figure 5-12, the 7/5 ratio and Kauth-Thomas "greenness" transform
are highly correlated (r = 0.91). The greenness transform is a fixed linear
transformation of all four Landsat channels and basically indicates a ratio of
reflective infrared bands (MSS 6 and MSS 7) to visible bands (MSS 4 and MSS 5).
Of interest to our research is whether this broader band ratio approach (i.e.,
greenness) is a more effective discriminator of cropland than the simple 7/5
ratio. Computational differences also need to be considered since the
Kauth-Thomas transform requires substantially more processing.
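
The computational difference between the two discriminants can be seen in a small sketch. The coefficients below are the commonly cited Kauth-Thomas (1976) values for Landsat MSS brightness and greenness; the text does not state which coefficient set was implemented in VICAR, so their use here is an assumption, and the two sample pixels are hypothetical. Greenness is a weighted sum with positive weights on the infrared bands and negative weights on the visible bands.

```python
# Commonly cited Kauth-Thomas coefficients for (MSS4, MSS5, MSS6, MSS7).
# Assumption: the report does not list the coefficients actually used.
KT_BRIGHTNESS = (0.433, 0.632, 0.586, 0.264)
KT_GREENNESS = (-0.290, -0.562, 0.600, 0.491)


def linear_feature(pixel, coefficients):
    """Fixed linear combination of the four MSS channels."""
    return sum(c * v for c, v in zip(coefficients, pixel))


def ratio_7_5(pixel):
    """Simple MSS 7 / MSS 5 ratio (one division per pixel)."""
    return pixel[3] / pixel[1]


# Hypothetical digital counts (MSS4, MSS5, MSS6, MSS7).
vegetated = (20, 15, 50, 40)
bare_soil = (30, 35, 40, 30)

green_veg = linear_feature(vegetated, KT_GREENNESS)
green_soil = linear_feature(bare_soil, KT_GREENNESS)
ratio_veg = ratio_7_5(vegetated)
ratio_soil = ratio_7_5(bare_soil)
```

Both measures rank the vegetated pixel above the bare soil pixel; greenness needs four multiplications and three additions per pixel, against a single division for the ratio.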

We have created the Kauth-Thomas transform channels for all five dates of
our Kern County data set. Using an interactive display program to determine
optimum cutoff values for the greenness channel, the three date analysis
conducted for the 7/5 ratio has been repeated.

Figure 5-16 is the July image in binary form after the cutoff has been 
determined. This should be compared to Figure 5-13, shown earlier. Figure 5-17 
is the sum of June, July, and October greenness classifications and should be
compared to Figure 5-14. 

The two sets of products visually compare very favorably, although a
detailed statistical comparison with the ground truth has not yet been
undertaken. When the processing of the digital ground truth map is completed, a
systematic comparison of the two approaches will be initiated.


Classification Using Cluster Labeling 

Because of the simple dichotomous decision by which cropland can usually
be determined in most of California, our efforts have been oriented towards
simple discriminants, like the 7/5 ratio and greenness, using a cutoff or
thresholding approach to classification. The more conventional approaches to
multispectral classification typically involve a maximum likelihood decision
rule and/or the use of measurement space cluster labeling. Since these
approaches provide a benchmark for comparison, we have also used them in our
Kern County test site. Film products and a systematic comparison with the
digital ground truth map will be generated during the next reporting period.

Summary of Developmental Efforts


In support of Task II activities, the following developmental efforts have 
been initiated or completed by the Santa Barbara group during this year: 

• A sun angle correction procedure using a simple cosine function has
been implemented in VICAR.*

• A calibration procedure for matching Landsat I and Landsat II radiance 
values has been implemented in VICAR using values provided by ERIM. 

• The ERIM XSTAR program which screens and corrects for atmospheric haze, 
has been reviewed for possible implementation in VICAR. 

• Kauth transformations for "brightness," "greenness," and "yellow stuff"
have been implemented in VICAR.

• Image to image cross-correlation procedures using existing VICAR pro- 
grams are under review as one means for automating portions of the 
registration process. 

• A more flexible, interactive environment, involving multiple video 
monitors, has been developed for rapid ground control point selection. 

• Programs to interface our coordinate digitizer output to IBIS (Image 
Based Information System) format have been partially supported to 
assist our digitization of field data and conversion to image format 
in registration with Landsat data. 

• A plotting program to view polygon data has been adapted to our
facility (in support of digital field data conversion).

• An interface between VICAR formats (band interleaved by line and band
sequential) and a statistical analysis package (SAS) has been written
to facilitate conventional statistical examination of multispectral data.

• A program has been developed and implemented to allow interactive
selection of a classification cutoff point, such as in the 7/5 ratio or
greenness images.
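
The first item in the list above, a sun angle correction using a simple cosine function, might look like the following sketch. The exact form implemented in VICAR is not given in the text; this version assumes the usual normalization by the cosine of the solar zenith angle (equivalently, the sine of the solar elevation), and the function name and reference elevation are hypothetical.

```python
import math


def sun_angle_correct(dn, sun_elevation_deg, reference_elevation_deg=90.0):
    """Scale a digital count to a reference sun elevation.

    Assumes a simple cosine law: cos(zenith) = sin(elevation).
    """
    scale = (math.sin(math.radians(reference_elevation_deg))
             / math.sin(math.radians(sun_elevation_deg)))
    return dn * scale


# A count of 30 acquired at 30 degrees sun elevation, normalized to the zenith.
corrected = sun_angle_correct(30, 30.0)
# The same multiplicative factor applied to two bands leaves their ratio unchanged.
ratio_before = 40 / 20
ratio_after = sun_angle_correct(40, 30.0) / sun_angle_correct(20, 30.0)
```

Because the correction is purely multiplicative, a band ratio such as 7/5 computed within a single date is unaffected by it when the same factor is applied to both bands.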

Of a more general nature, a limited amount of support was also provided to assist
the implementation of necessary driver and support programs for an
optical-mechanical filmwriter. This device provides film output from digital
data and is now being used quite extensively for this project.


*VICAR is the image processing package developed by JPL; it has been implemented
on an Itel AS/6 at UCSB.




Evaluating classification accuracies for digital techniques requires the 
systematic comparison of field data, or "ground truth", with image classifica- 
tion results. By digitizing field boundaries and converting this data into a 
raster or grid format it is possible to overlay ground truth with Landsat data 
and systematically analyze the composite data set in a geobased information
system structure. An interactive package of coordinate digitizing programs
has been developed at UCSB to provide input to IBIS (Image Based Information
System). Figure 5-18 is an example of a crop map for our three quad test site
in Kern County after field borders have been digitized and rasterized.
Subsequent processing assigns a class number to each field based upon field data.
Once in the grid format this data can be processed by the full complement of 
VICAR and IBIS programs. 
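
The conversion of digitized field boundaries to a grid can be illustrated with a simple point-in-polygon test applied to each cell center. This is only a sketch of the general idea, not the IBIS or UCSB digitizing code: the function names, the coordinate convention (x = point, y = line) and the single square field are hypothetical.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the closed polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(fields, n_lines, n_points):
    """Assign each field's class number to the cells whose centers it contains."""
    grid = [[0] * n_points for _ in range(n_lines)]
    for class_number, vertices in fields:
        for line in range(n_lines):
            for point in range(n_points):
                if point_in_polygon(point + 0.5, line + 0.5, vertices):
                    grid[line][point] = class_number
    return grid


# One hypothetical field (class 3) covering the upper-left 2 x 2 cells.
fields = [(3, [(0, 0), (2, 0), (2, 2), (0, 2)])]
grid = rasterize(fields, n_lines=3, n_points=4)
```

Once the ground truth is in this grid form, it can be compared cell by cell with a registered classification.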




5.2.2 Sacramento Valley Test Site - UCB


The registered multitemporal data set described previously (Section 5.1.1)
was used for a relatively large scale test of the utility of the
MSS 7/MSS 5 discriminator. The analysis was composed of three major parts:
(1) creation of the classified 7/5 output; (2) analysis of the classification
for use with a regression estimator; and (3) analysis of the mapping accuracy
of the classified output.

MSS 7/MSS 5 Ratio Classification 

As described in Section 5.1.1, the multitemporal data set used for this 
sub-task consisted of four files 620 points by 800 lines with three 7/5 ratioed 
bands; one band from each of three acquisition windows (May 22 and 30, August 23 
and 29, October 3 and 4). The data had been rotated to a north-south orientation 
and transformed so that a particular cell represented the same point on the 
ground for all three time periods. Each 30' segment was made up of a 4 x 4
matrix of 7.5' quadrangle areas. Combining the four 30' segments (Chico, Corning,
Maxwell and Sutter) produced the total 1° block.

By 30' block, the 7/5 ratio bands for each date were analyzed and a thresh- 
old value selected to separate irrigated from non-irrigated acreage. It was 
expected that the 7/5 threshold value would vary by date and ground location 
of the 30' block due to: (1) changes in the condition of annual grasslands bor- 
dering the area; (2) changes in type and proportion of crops grown; and (3) 
shifts in crop calendars due to climatic and latitudinal variations. Using 
the RSRP interactive image display system, each 30' block was displayed and 
analyzed separately. 

To set the threshold value for a given band (date) the 7/5 data displayed
on the TV monitor was compared to the DWR ground data (100% ground data was
available for the entire area). Using a real time masking option, picture
elements with values below a specified 7/5 value were masked out. This value 
was adjusted until the area shown as irrigated on the display corresponded as 
closely as possible to the irrigated area on the ground data maps. To further 
refine the threshold value selection, statistics (mean values and ranges of 
values) were obtained for the major crops, native vegetation and grassland in 
the area. 
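
The interactive adjustment described above was visual, matching the displayed irrigated area to the ground data maps. A crude numerical analogue, offered only as a sketch, is to pick the candidate threshold whose classified irrigated fraction is closest to the ground data fraction. This matches total area only, not spatial correspondence, and every name and value below is hypothetical.

```python
def irrigated_fraction(ratios, threshold):
    """Fraction of pixels whose 7/5 ratio exceeds the threshold."""
    return sum(1 for r in ratios if r > threshold) / len(ratios)


def select_threshold(ratios, ground_fraction, candidates):
    """Candidate threshold whose irrigated fraction best matches the ground data."""
    return min(candidates,
               key=lambda t: abs(irrigated_fraction(ratios, t) - ground_fraction))


# Hypothetical 7/5 ratio values for one date and a ground irrigated fraction of 0.5.
ratios = [0.8, 0.9, 1.0, 1.1, 1.3, 1.5, 1.8, 2.0]
threshold = select_threshold(ratios, ground_fraction=0.5,
                             candidates=[1.0, 1.2, 1.4])
```

In practice the per-class statistics mentioned in the text (means and ranges for major crops, native vegetation and grassland) would further constrain the choice.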

The threshold values for each date (Figure 5-19) were used to create an 
irrigation class map for each 30 minute block. For a given date, the 7/5 ratio 
of each pixel was compared to the selected threshold value and was labeled as 
irrigated if its value was greater than the threshold. After each pixel 
was labeled irrigated or not irrigated on all three dates, the bands were 
combined to create a class map. The three date pattern of irrigation for
each pixel was then labeled as one of 8 classes:

Class 1: not irrigated on any date
Class 2: irrigated in October only
Class 3: irrigated in August only
Class 4: irrigated in August and October
Class 5: irrigated in May only
Class 6: irrigated in May and October
Class 7: irrigated in May and August
Class 8: irrigated in May, August and October
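
The eight class numbers listed above follow a binary pattern, with May, August and October acting as the 4s, 2s and 1s digits. A minimal sketch of that encoding follows; the function names and the example ratio and threshold values are hypothetical.

```python
def irrigation_class(may, august, october):
    """Class number (1-8) from the three per-date irrigated/not-irrigated labels."""
    return 1 + 4 * int(may) + 2 * int(august) + int(october)


def class_from_ratios(ratios, thresholds):
    """Label each date by comparing its 7/5 ratio to that date's threshold."""
    may, august, october = (r > t for r, t in zip(ratios, thresholds))
    return irrigation_class(may, august, october)


# Hypothetical pixel: above threshold in May and August, below in October.
example = class_from_ratios(ratios=(1.6, 1.9, 0.9), thresholds=(1.2, 1.2, 1.2))
```

Classes 2 through 8 together make up the "irrigated on at least one date" category used for the two class map.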


The resulting class map was displayed and exclusion areas, such as wildlife refuges, 
wildland areas, and large urban areas, were masked out. These exclusion areas were 
determined using DWR's 7.5 minute quadrangle land use maps. The boundaries 
were transferred to the digital data using a real time polygon delineation routine
controlled by the cursor. The exclusion areas were masked out and for display 
purposes relabeled as non-irrigated. 


Figure 5-19. 7/5 ratio threshold values selected for the May, August and
October dates in each 30' block (Corning, Chico, Maxwell and Sutter).

A class map for the 1° block was created by sewing together the four 30
minute segments. An empty file, 1240 points by 1600 lines, was created and the
four 30 minute blocks (each 620 points by 800 lines) were transferred to the
proper location. The thresholded 7/5 1° block was displayed in two ways. First,
as a two class map (Figure 5-20a), irrigated or not irrigated, where classes
2 through 8 were combined into one class labeled irrigated. Second, where the
8 classes were differentiated, showing the temporal pattern of irrigation
(Figure 5-20b).

The classification was then summarized by 7.5 minute quad to output a 
measurement of the proportion irrigated for each quad. Within each 7.5' quad, 
pixel counts were summarized for each of the 8 classes. Using the quad sum- 
maries, the accuracy of the 7/5 discriminant results when used with a regression 
estimator was assessed. 
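
The per-quadrangle summary can be sketched as a block-wise count over the class map. The 155 point by 200 line quadrangle size is the one given elsewhere in this report for the 7.5' quadrangles; the function name and the small 4 x 4 test map are hypothetical, and classes 2 through 8 are treated as irrigated as described above.

```python
def quad_proportions(class_map, quad_lines=200, quad_points=155):
    """Proportion of cells in classes 2-8 for each quadrangle-sized block."""
    n_lines, n_points = len(class_map), len(class_map[0])
    result = {}
    for top in range(0, n_lines, quad_lines):
        for left in range(0, n_points, quad_points):
            cells = [class_map[r][c]
                     for r in range(top, min(top + quad_lines, n_lines))
                     for c in range(left, min(left + quad_points, n_points))]
            irrigated = sum(1 for k in cells if k >= 2)
            result[(top // quad_lines, left // quad_points)] = irrigated / len(cells)
    return result


# Hypothetical 4 x 4 class map summarized in 2 x 2 blocks for illustration.
tiny_map = [
    [1, 1, 8, 8],
    [1, 2, 8, 8],
    [1, 1, 1, 1],
    [1, 1, 3, 1],
]
proportions = quad_proportions(tiny_map, quad_lines=2, quad_points=2)
```

Applied to a 1240 point by 1600 line class map with the default block size, this yields the 8 x 8 = 64 quadrangle proportions used as the Landsat variable in the regression estimator.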

Accuracy of the Regression Estimator 

For each of the sixty-four 7.5' quadrangles in the 1° block a measurement
of the percent irrigated (Figure 5-21) as well as DWR's ground truth was
available. Using the 7.5' quadrangles as sample units, it was possible to
estimate the parameters of the regression estimator and its variance
(Section 4.1.3, Equations 2 and 2a, respectively). The estimates were made both
with and without stratification.* For comparison, estimates using 1 x 5 mile
sample units were also made.
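
As an illustration of how such an estimate is formed, the sketch below uses the standard textbook linear regression estimator for a simple random sample. It is assumed, not confirmed, that this matches Equations 2 and 2a of Section 4.1.3; the function name, variance approximation and sample values are all hypothetical.

```python
def regression_estimate(x_sample, y_sample, x_population_mean, population_size):
    """Textbook regression estimate of the mean of y, with approximate variance.

    x: Landsat percent irrigated (known for every unit in the population)
    y: ground percent irrigated (known only for the sampled units)
    """
    n = len(x_sample)
    x_bar = sum(x_sample) / n
    y_bar = sum(y_sample) / n
    sxx = sum((x - x_bar) ** 2 for x in x_sample)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(x_sample, y_sample))
    b = sxy / sxx  # least squares slope
    estimate = y_bar + b * (x_population_mean - x_bar)
    residual_ss = sum((y - y_bar - b * (x - x_bar)) ** 2
                      for x, y in zip(x_sample, y_sample))
    # Approximate variance with finite population correction.
    variance = (1 - n / population_size) * residual_ss / ((n - 2) * n)
    return estimate, variance, b


# Hypothetical sample of 4 quadrangles out of 64; Landsat reads 5 points low.
landsat = [10.0, 20.0, 30.0, 40.0]
ground = [15.0, 25.0, 35.0, 45.0]
estimate, variance, slope = regression_estimate(landsat, ground, 28.0, 64)
```

In this contrived case Landsat under-measures every quadrangle by the same amount, so the estimator recovers the ground value exactly with zero residual variance, which is the bias-correcting property discussed under Mapping Accuracy.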

The analysis showed that good estimates (±5% at 95%) can be achieved with
as few as fifteen 7.5' quadrangles as compared to the fifty 1 x 5 mile SUs of
Task I (Figure 5-22). However, these 15 quadrangles represent an area of
approximately 233,107 ha (576,000 A) compared to the 64,752 ha (160,000 A) of
the 50 Task I SUs needed to achieve ±5% at 95%. An important part of the
continuing work on Task II will be determining the appropriate size of SUs for
digital analysis procedures.

Mapping Accuracy 

One advantage of the regression estimator is that it corrects for bias 
(difference from truth) at the Landsat phase. Thus, if the percent irrigated 
is consistently over or under measured on Landsat, the regression estimator 
will give an unbiased estimate without an increase in variance. In generating
an accurate map (as opposed to an accurate estimate), however, bias can be
very detrimental. Map accuracy depends on minimizing miscalls: (1) errors of
omission (missing land that was actually irrigated) and (2) errors of commission
(classifying land as irrigated that actually was not).


*The stratified case used two strata based on the Landsat percent irrigated in
each 7.5' quadrangle: stratum A had at least 60% irrigated and stratum B had
less than 60% irrigated. Stratum A generally included agricultural practice
strata 1, 2, 7 and excluded areas, while stratum B generally included strata
3, 4, 5, and 6 (Figure 4-7).

Figure 5-20a legend:
Black = Non-Irrigated
Red = Irrigated

Figure 5-20b legend:
Black = Non-Irrigated
Blue = Irrigated on May only
Pink = Irrigated on August only
Purple = Irrigated on October only
Green = Irrigated on May and August
Tan = Irrigated on May and October
Red = Irrigated on August and October
White = Irrigated on all three dates

Figure 5-20. Sacramento 1° Block showing land labeled as irrigated. The grid
superimposed on the classified output outlines 7.5' quadrangles.



















To assess the mapping accuracy of irrigated land in the Sacramento Valley
1° block, 32 of the 7.5' quadrangles were selected from the population of 64
in a checkerboard pattern. For each quadrangle, 24 points were systematically
chosen (4 x 6 grid). Each dot identified a field for which the following data
was summarized: (1) irrigated or non-irrigated on the class map (Figure 5-20
a and b); (2) irrigated or non-irrigated on DWR's land use survey (Figure 1-2);
(3) land use (crop type) assignment on DWR's land use survey; and (4) field
size (< 4.05 ha, 4.05-7.69 ha, 7.70-15.78 ha, > 15.79 ha [< 10 A, 10-19 A,
20-39 A, ≥ 40 A]).

For the 763 points over all 32 quadrangles the map accuracy was very good:
percent correct = 94.0, percent omission = 7.4, and percent commission = 6.3.
Examination of the four 30' blocks separately showed some deviation from the 
overall results (Table 5-5). 
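
The three accuracy measures can be computed from four point counts, as in the sketch below. The denominators are an assumption: omission is taken relative to points irrigated on the ground, and commission relative to points classified as irrigated. The counts used are hypothetical, chosen only because they sum to 763 and reproduce the overall percentages quoted above under that assumption; the report does not give the underlying breakdown.

```python
def map_accuracy(both_irrigated, omitted, committed, both_not_irrigated):
    """Percent correct, omission and commission from point counts.

    omitted:   irrigated on the ground, not irrigated on the class map
    committed: irrigated on the class map, not irrigated on the ground
    """
    total = both_irrigated + omitted + committed + both_not_irrigated
    ground_irrigated = both_irrigated + omitted
    classified_irrigated = both_irrigated + committed
    return {
        "percent_correct": 100 * (both_irrigated + both_not_irrigated) / total,
        "percent_omission": 100 * omitted / ground_irrigated,          # assumed base
        "percent_commission": 100 * committed / classified_irrigated,  # assumed base
    }


# Hypothetical counts summing to 763 points.
accuracy = map_accuracy(both_irrigated=313, omitted=25, committed=21,
                        both_not_irrigated=404)
rounded = {name: round(value, 1) for name, value in accuracy.items()}
```

Other conventions for the commission denominator exist (for example, points not irrigated on the ground), so any comparison with other studies should state which base is used.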


Table 5-5. Task II Map Accuracy

                 % Correct    % Omission    % Commission

1° Block           94.0          7.4            6.3

30' Blocks
  Chico            94.3          3.3           13.0
  Corning          96.4          4.3            6.3
  Maxwell          96.9          7.1            1.5
  Sutter           88.5         10.4            5.1


A closer examination showed that the errors were dependent on the percent
irrigated in any particular 7.5' quadrangle (Figure 5-23). For quadrangles
with low percent irrigated (< 33%) few errors of any kind occurred. For moderate
percent irrigated (33-67%) errors were primarily, but not exclusively, errors
of commission. For high percent irrigated (> 67%) errors were primarily, but
not exclusively, errors of omission. This pattern was significant using a
Chi-squared test (α = 0.003, Table 5-6).

Figure 5-23. Panel titles: TOTAL ERROR FOR EACH 7.5' QUAD; SEPARATE OMISSION
AND COMMISSION ERRORS FOR EACH 7.5' QUAD.








Table 5-6. Observed pattern of omission and commission errors as a function
of percent irrigated. The value of Chi-square is for a test of the
null hypothesis that omission and commission error rates are
independent of percent irrigated.

                        PERCENT IRRIGATED
                        ≤ 67%      > 67%

OMISSION ERROR            7          18
COMMISSION ERROR         15           6

Chi-square = 8.8, df = 1, α = 0.003

Many omission errors were in irrigated grain fields (13 of 25 points). Commission
errors were generally in areas of native vegetation (13 of 21 points).

A simultaneous analysis of percent irrigated and crop/land use type showed
no significant pattern of commission or omission error, i.e., omission errors
were not primarily associated with grain located in areas of high irrigation;
commission errors were not primarily associated with native vegetation located
in areas of moderate irrigation.

Field size did have an effect on error rate. Proportionally more errors
were made in small fields than in large fields. This pattern was significant
using Fisher's exact probability test (α = 0.011, Table 5-7).

Table 5-7. Observed pattern of fields correctly and incorrectly classified as
a function of field size for the 7.5' quadrangles.




Continuing Work 


Further analysis of the Sacramento Valley digital data set is planned for
1980. To continue the analysis an upgrade of UC Berkeley's Survey Planning
Model (SPM) will be completed to allow inexpensive simulation of sample frame
and irrigated (or crop) proportion(s) by spectral class over very large areas.
The SPM will also allow simultaneous summary of irrigated proportion(s) by
sampling stratum, measurement error strata and reporting unit strata.
Additionally, the SPM can be used to compute multivariate sample allocation for
additional sample designs including ratio and regression. A test of the SPM
on the 1976 Sacramento 1° block area will include the computation of first and
second stage sample unit population variances for varying sizes of (1) primary
sample units (PSU) (i.e., 7.5' quadrangles [155 x 200 cells], flight line strips
[vertical 20 x 200 cells, horizontal 155 x 25 cells]) and (2) secondary sample
units (SSU) (i.e., flight line strips, a field or field groups with a 7.5'
quadrangle PSU). The test will also allow us to compute hypothetical PSU and SSU
sample sizes and allocation among strata that minimize total variable cost
subject to meeting pre-specified sample error requirements for an estimate of
irrigated proportion.


5.3 PROPOSED WORK FOR 1980 

The encouraging results of this digital analysis task make a large scale 
demonstration of the 7/5 ratio technique appropriate. Therefore, our objective 
for Task II in 1980 will be to define and demonstrate a Landsat, digitally- 
aided approach to estimating and mapping irrigated land on a hydrologic basin 
basis. Tentative demonstration areas are the Sacramento Valley and Tulare hy- 
drologic basins. 

For each of these areas, three dates of registered 1979 Landsat 7/5 data
will be used for Phase I measurement. Setting of the 7/5 irrigation line
(threshold) for creation of the class map will be done on interactive display
and analysis systems available at the University. Sample unit data obtained for
the Task I inventory will provide ground data.

Within this large scale demonstration several key design and evaluation
activities will take place. (1) Definition of the procedure for setting an
accurate Landsat 7/5 line needs to be refined. (2) Evaluation of map accuracy
on a point-by-point basis by irrigation line threshold, region and combination
of dates should be examined. (3) The form (linear, non-linear) and correlation
of the Landsat to ground irrigated area relationship needs to be determined.
(4) Cost and throughput rates should be documented, and (5) the Survey Planning
Model could be used to determine (a) sample frame characteristics giving the
lowest total variable cost (TVC) subject to error goals, (b) the impact of
Landsat classification error on the final error of the irrigated proportion
estimate, (c) the type of sample unit selection procedure that minimizes TVC
subject to error goals and (d) the expected TVC, sample size and allocation
among sample stages and strata that are necessary to meet given inventory
error goals.

6.0 CROP TYPE ANALYSIS (TASKS III AND IV)

In the past, Task III (manual analysis) and Task IV (digital analysis)
have been studied separately. In 1979, they were generally combined into one 
task with three major sub-tasks. The first sub-task began work on defining 
and understanding the nature of the complex and dynamic agricultural environ- 
ment in California. The second sub-task began establishing a spatial, temporal, 
spectral base for a continuing (and increasing) effort in 1980. Third, the 
basic system for defining DWR's multicrop information requirements was started. 

Three test sites were used for the multicrop analysis this year. UCSB
studied a two-7.5' quad area on the Oxnard Plain of Ventura County (south coastal
environment) and the three-quad area in Kern County described in Section 5.0.
UCB concentrated work on the 1° block in the Sacramento Valley (also described
in 5.0). See Figure 3-1.

6.1 WORK COMPLETED BY UCSB 

6.1.1 Multicropping Studies 


Largely as a result of California's generally mild climate, many areas 
support two or more crops per field in a single season. This results in an 
important temporal component in the design of a remote sensing program to 
monitor irrigated croplands. The timing of image acquisition and supporting 
field work is critical to the identification of specific crops and, to a lesser 
extent, irrigated land. 


Multicropping Questionnaire 

To determine the level of multicropping, the crops involved and the
critical times, a questionnaire was sent to each county's agricultural
extension office. Thirty-eight counties responded. The questionnaire was
designed to gather information on the total acreage involved in multicropping
as well as the specific crops and their planting sequences. An additional
question probed those factors important in the farmer's decision to multicrop.

The results indicate that the northern portion of the state does not do
much multicropping (generally less than 5% of the total farmed acreage), with
that which does exist consisting primarily of small grains as the first crop.
These are generally followed by pasture crops, sorghum or milo. In the southern
and central interior portion of the state, considerably more multicropping
occurs [Imperial County recorded over 40,500 hectares (100,000 acres) or
approximately ... of total farmed acreage in 1977]. While the small
grains/sorghum or milo combinations are common, more vegetables are found in
the multicropping sequence. The increased importance of multicropping in these
areas is related to the warmer climate and availability of irrigation water. In
the coastal areas (e.g. Salinas Valley and Oxnard Plain), multicropping tends
to be widely practiced (over 75% of the land in truck crops is cropped more
than once in a given season in Coastal Ventura County, according to local
sources) with some fields supporting three and four different truck crops per
year. In these areas, a year-round mild climate and high land rents combine to
create dynamic cropping patterns. Such patterns require greater reliance on
multidate imagery. The variety and intensity of vegetable multicropping in many
of the coastal sites, where unique phenologies are not seen for many crops, may
require the definition of crop groups.


Oxnard Plain Multicropping Dynamics Study 


In order to determine the data requirements for monitoring both irrigated
lands and crop type in a multicropping environment, a four date cropping
dynamics study was undertaken on the Oxnard Plain. Early reports indicated that
the combination of mild climate, excellent soils and high land rents resulted
in intensive multicropping, with many fields yielding three and four crops in a
single year. Our field work indicated that except for certain stable crops --
citrus, strawberries, turf grass and flowers -- crop rotation in this area
occurs over short periods of time.

Four visits were made to the study site in 1979 (April, June, August and
November). With the exception of the last visit, the data collection efforts
were 2 months apart. On each date, two field crews mapped crop type and field
boundaries on a high altitude aerial photo base (1:32,500 scale). Additional
information was gathered on crop growth state (emergent, young, mature) and
conditions of particular interest (agricultural land being converted to urban
use, removal of stable citrus groves for truck crop farming, etc.). The data
was designed to document the type and location of crops in the area as well as
the turnover rate for non-stable crops. Data was gathered for most of the
agricultural lands on the Oxnard and Camarillo 7.5' USGS quadrangles.

In order to analyze the data, certain simplifications were required. A
sampling scheme using a dot grid with 0.5 inches between each point was employed.
The dot grid spacing was chosen on the basis of data manageability rather than
for strict statistical reasons. Each dot represented approximately 10 hectares
(25 acres). There were a total of 1464 dots for which there was field data on
at least one of the four dates.

Table 6-1 shows the crop types or field conditions seen on each of the four
dates. The major crop types are lemons, strawberries, sod and assorted
vegetables. Of additional importance is the fairly large proportion of fallow
cropland. In most cases, this category represents fields that are in
preparation for planting. It is not clear if our field visits merely coincided
with this phenomenon or if a large proportion of the fields can be expected to
be fallow at any one time.

To evaluate the dynamics of multicropping, two sets of tests were
conducted -- a three date and a two date analysis. For the three date tests
(Table 6-2), those fields for which data were available on three consecutive
dates were examined. For the April-June-August sequence, there were 51S fields;
for June-

Table 6-1

Crops and Field Conditions Sampled on the Oxnard Plain

Crops                   April     June    August    November

Fallow                   326       250      249        357
Artichokes                           1
Asparagus                  1         2
Cole Crops                26        27       24        215
Celery                   106        39       21        234
Lettuce                   69         7        7         39
Melons/Squash                       16       39
Peas                      16                  5
Spinach                   27        16        1
Tomatoes                           144      198         21
Strawberries              42        36       42         60
Peppers                             28       72         16
Parsley                    7                 13         17
Misc. Truck Crops         10        26       12         29
Corn                                 1        1
Beans (dry)                        219      410
Flowers                   21         6        4          4
Sod                       21        29       40         47
Lemons                   127       163      183        198
Oranges              [illegible]
Avocado                    3         3

August-November, there were 92S. Each field was then examined to determine the
nature of the cropping sequence. Four characteristic sequences were noted --
the same crop or field condition on three different dates; the same crop or
field condition on two consecutive dates; the same crop or condition on the
first and third date, separated by a different crop on the second date; and,
three different crops over the three date sequence. Table 6-2 shows the results
for all three date sequences. In this case, fallow cropland conditions are
treated as a crop-type.
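
The four characteristic sequences can be assigned mechanically once each sample point carries a crop or condition label for three consecutive dates. The following sketch shows one way to do so; the function name, category labels and example observations are hypothetical, and fallow is treated as a crop type as in the text.

```python
from collections import Counter


def sequence_type(first, second, third):
    """Classify a three-date crop sequence into one of four characteristic types."""
    if first == second == third:
        return "same on all three dates"
    if first == second or second == third:
        return "same on two consecutive dates"
    if first == third:
        return "same on first and third dates"
    return "different on each date"


# Hypothetical observations for five sample points.
observations = [
    ("lemons", "lemons", "lemons"),
    ("celery", "celery", "tomatoes"),
    ("fallow", "tomatoes", "fallow"),
    ("lettuce", "beans", "cole crops"),
    ("fallow", "beans", "beans"),
]
tally = Counter(sequence_type(*obs) for obs in observations)
```

To repeat the analysis without fallow land, points with a fallow label on any date would simply be filtered out before tallying.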

If those fields where fallow conditions were noted on at least one of the 
dates are excluded, there is a reduction of more than 50 percent of the fields 
(Table 6-2). Nevertheless, those cases where different crops are found on each 
different date are a significant part of the total. If the fallow lands and 
stable areas (citrus and turfgrass primarily) are ignored, this condition, 
representative of intensive multicropping, represents between 35 and 45 percent
of fields measured here. 

The two-date test was designed to determine the amount of multicropping
that could be expected over a two-month period. Three two-date sequences
(April-June, June-August and August-November) were examined. For the April-June
test there were 565 fields; June-August, 986; August-November, 1186.
The results were tallied with and without fallow field conditions. The results
are shown in Table 6-3. For each two date sequence (with or without fallow
land), the condition of different crops being seen on different dates
predominates.

The results of this sub-study indicate that in the type of environment
characterized by the Oxnard Plain, where multicropping is commonly practiced,
a significant number of air photo or Landsat acquisitions would be required
to accurately determine irrigated acreage. Certainly the inadequacy of the
current DWR procedure is evident here. Classification of particular crop types
would probably be difficult for most of the truck crops because of their short
time in the field and the fact that many, such as celery and lettuce, are found
on all four dates and do not appear to have a unique phenological cycle. The
definition of satisfactory crop groupings may alleviate some of these problems.
Perhaps the principal value of Landsat in this situation is the ability to define
intensive multicropping practices.

Our original research plan for the Oxnard Plain test site included a 1979
multidate Landsat analysis. A review of Landsat imagery available for 1979
indicates, however, that only one date of imagery was acquired during the
primary growing season (April-September). This appears to be due to Landsat
data processing problems rather than cloud coverage, since imagery was
apparently not acquired for several clear day overpasses.

Table 6-3

Oxnard Plain Multicropping Dynamics
Analysis of 2 Consecutive Dates

Including Fallow Land

Date                 Same             Different        Total

April/June           148 (26.2%)      417 (73.8%)       565
June/August          302 (30.6%)      684 (69.4%)       986
August/November      295 (24.9%)      891 (75.1%)      1186

Excluding Fallow Land

Date                 Same             Different        Fallow

April/June           102 (18.1%)      129 (22.8%)      334 (59.1%)
June/August          245 (24.9%)      360 (36.5%)      381 (38.6%)
August/November      241 (20.3%)      442 (37.3%)      503 (42.4%)

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6.1.2 Crop Phenology 


Two crop phenology projects have been undertaken during this reporting 
period. The first involves the expansion of our crop phenology diagram series 
to include alfalfa, melons, and sugarbeets, in addition to cotton and small 
grains, previously completed. The second task involves a crop phenology 
survey of the Central Valley under the direction of Dr. Michael Nuttonson, 
former director of the Crop Ecology Institute. Dr. Nuttonson is designing 
a survey form and collecting necessary collateral data. The survey will 
be conducted in cooperation with the University's Agricultural Extension 
Service. 


6.1.3 Digital Crop Identification 

Digital crop identification tests have been initiated in both Tulare 
and Kern Counties. As mentioned earlier, tests were scheduled for Ventura 
County (Oxnard Plain site) but suitable imagery has apparently not been 
acquired. Nine dates during the 1973 crop season have been obtained for 
Tulare County and reformatted into VICAR format. This data set will now 
be registered using the cross correlation techniques discussed earlier, and 
a limited amount of DWR's ground truth data will be digitized and registered 
to the data set. This will allow automated labeling and performance 
evaluations. 

As an alternative to the Ventura data set, we have begun crop identification 
tests using the five-date Kern County data for 1976. Field crop maps acquired 
by a water district have been digitized for a three 7 1/2 minute quadrangle 
site and registered to the Landsat data, as discussed earlier in our Task U 
section. Both supervised and unsupervised (clustering) approaches are being 
explored for the crop identification tasks. 


6.2 WORK COMPLETED BY UCB 

Working in the same Sacramento Valley 1° test site discussed in Section 
5.0, UCB addressed two general issues. First, basic spectral/temporal data is 
needed on the major crops of the area as input for inventory design and class- 
ification procedures. Second, DWR ultimately requires output in a map-like 
form (preferably 7.5' quad) with the capability of recombining the data shown 
on the map in a variety of ways (e.g., by water district, study area). 

The first step in pursuing spectral/temporal patterns of agriculture in 
this area was to determine the major crops and their spatial distribution. Using 
County Agricultural Commissioner's reports and DWR's 7.5' quads and tabulated 
summaries, a general description of agriculture in the whole Sacramento Valley 
(U counties) and a detailed analysis of the five counties covered by the 1° 
block were done. For each of these five counties (Butte, Colusa, Glenn, Sutter 
and Tehama) reported crop acreages were tabulated (Table 6-4a). Crops were 
selected for specific study if they represented either 5% of any single county's 


Table 6-4. Part "a" lists the acreages of crops in the five counties located 
in the 1° block. Part "b" summarizes those crops that represented approximately 
5% of the tabulated acreage. 

a. CROPS - 1976 

Crop                 Butte    Colusa       Glenn        Sutter   Tehama   Total     % of Total 
•Barley              18,300   12,300       3,500        19,000   7,100    60,200    3.1 
Beans                3,100    9,375        2,552        11,575   -        31,503    2.7 
•Corn                8,300    16,000       3,000        15,592   1,220    49,612    4.2 
•Hay, Alfalfa        6,500    [illegible]  16,000       9,715    4,000    39,655    3.4 
     Grain           3,000    [illegible]  -            7,174    2,900    14,274    1.2 
     Other           1,150    -            3,000        20,430   2,000    26,580    [illegible] 
•Oats                5,000    -            -            1,043    2,500    3,543     0.7 
•Pasture, Irr.       19,800   12,000       36,000       24,000   32,200   124,000   10.3 
•Rice                70,000   108,000      [illegible]  73,964   -        310,000   26.2 
Safflower            -        8,100        [illegible]  3,226    -        17,254    1.5 
•Sorghum             11,200   9,150        6,500        23,177   1,850    51,877    4.4 
Sugar Beets          4,040    12,300       7,222        5,543    815      30,525    2.6 
•Wheat               27,600   39,000       22,500       40,000   14,000   143,100   12.1 
•Fruit & Nut Crops   64,376   22,150       21,052       46,233   25,553   180,139   15.2 
Seed Crops           20,108   6,145        5,613        19,301   3,077    55,044    4.7 
Vegetable Crops 
  Melons             -        -            -            1,356    -        1,556 
  Pumpkins           -        -            -            1,255    -        1,265 
  Squash             -        -            -            -        522      [illegible] 
 •Tomatoes, Canning  -        3,000        -            24,300   -        32,500    [illegible] 
            Fresh    -        -            -            35       -        85 
  Corn, Sweet        -        -            -            173      -        173 
  Watermelons        -        -            -            215      -        215 
Miscellaneous        1,332    -            [illegible]  211      95       3,583 

b. Crops that met or marginally met the 5% limit (columns in the original: 
Butte, Colusa, Glenn, Sutter, Tehama, All; the per-county marks are not 
legible in the source): 

Barley, Corn, Rice, Wheat, Hay, Pasture, Sorghum, Fruit & Nuts, Tomatoes 

• = Crop represented 5% or greater of reported acreage 
- = Crop represented between [illegible] of reported acreage 

GROUPING OF CROPS FOR TASK IV ANALYSIS: 

RICE 
SMALL GRAINS (Wheat, Barley) 
ORCHARDS (Fruit & Nut) 
PASTURE (Pasture and Hay) 
SORGHUM 
CORN 
TOMATOES 

total or, when combined, 5% of all the counties' acreage. Table 6-4b shows 
the crops which met or marginally met the 5% limit. The nine crops listed in 
Table 6-4b were further combined into the crop groups shown at the bottom of the 
table. 
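
The selection rule described here (a crop qualifies if it reaches 5% of any single county's total, or 5% of the combined total) can be sketched as below. The function name and the acreages in the example are hypothetical and are not taken from Table 6-4; the sketch also ignores the "marginally met" case.

```python
def select_crops(acreage, threshold=0.05):
    """Return crops reaching `threshold` of any county total or of the combined total.

    acreage: {county: {crop: acres}}
    """
    county_totals = {county: sum(crops.values()) for county, crops in acreage.items()}
    grand_total = sum(county_totals.values())

    # Acreage of each crop summed over all counties.
    combined = {}
    for crops in acreage.values():
        for crop, acres in crops.items():
            combined[crop] = combined.get(crop, 0) + acres

    selected = set()
    for county, crops in acreage.items():
        for crop, acres in crops.items():
            if acres >= threshold * county_totals[county]:
                selected.add(crop)          # meets the single-county test
    for crop, acres in combined.items():
        if acres >= threshold * grand_total:
            selected.add(crop)              # meets the combined test
    return sorted(selected)


# Hypothetical two-county example (not the report's data).
example = {
    "County X": {"rice": 90, "oats": 4, "corn": 6},
    "County Y": {"rice": 50, "oats": 1, "wheat": 49},
}
chosen = select_crops(example)
```

In this example corn qualifies through County X alone (6%), while oats fails both tests.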

It was felt that a sample of the 64 possible 7.5' quads in the area would 
provide ample statistics for our use. Using the 1976 DWR ground data, each 
7.5' quad in the 1° block was examined for the presence of agriculture. Quads 
with less than 5% of their total area in cultivation were eliminated. The 
remaining 38 quads were divided into blocks of three, and the one of every three 
with the most even distribution of the major crops was selected for analysis 
(Figure 6-1). 
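
A minimal sketch of this quad selection follows. The text does not say how "most even distribution" was measured or how the blocks of three were formed, so the Shannon-entropy evenness score, the grouping of quads in list order, and all names and numbers below are assumptions made for illustration.

```python
import math


def evenness(crop_areas):
    """Shannon entropy of the crop-area shares (assumed evenness measure)."""
    total = sum(crop_areas.values())
    if total == 0:
        return 0.0
    return -sum((a / total) * math.log(a / total)
                for a in crop_areas.values() if a > 0)


def select_quads(quads, min_ag_fraction=0.05, block_size=3):
    """Drop quads below the agriculture threshold, then keep the most
    evenly cropped quad from each consecutive block of `block_size`."""
    eligible = [q for q in quads if q["ag_fraction"] >= min_ag_fraction]
    chosen = []
    for start in range(0, len(eligible), block_size):
        block = eligible[start:start + block_size]
        chosen.append(max(block, key=lambda q: evenness(q["crops"]))["name"])
    return chosen


# Hypothetical quads (names and areas are illustrative only).
quads = [
    {"name": "Q1", "ag_fraction": 0.02, "crops": {"rice": 10}},
    {"name": "Q2", "ag_fraction": 0.40, "crops": {"rice": 90, "corn": 10}},
    {"name": "Q3", "ag_fraction": 0.50, "crops": {"rice": 50, "corn": 50}},
    {"name": "Q4", "ag_fraction": 0.30, "crops": {"rice": 100}},
    {"name": "Q5", "ag_fraction": 0.60, "crops": {"rice": 30, "corn": 30, "wheat": 40}},
]
selected = select_quads(quads)
```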



Figure 6-1. This diagram of the 64 7.5' quads in the 1° Sacramento Valley block 
shows (1) quads having less than 5% agriculture (no shade or pattern); 
(2) quads having 5% or greater agriculture that were not chosen for 
analysis (dotted pattern); and (3) quads having 5% or greater agri- 
culture that were chosen for analysis (gray). Quads with the most 
even distribution of crops (based on DWR's land use survey summaries) 
were selected. The Sutter 30' block is located in the southeast 
corner and outlined with a heavy line (the irregularly shaped poly- 
gon is the perimeter of the Sutter Buttes). 







Statistical data was obtained for the major crops in these selected 7.5' 
quads. Fields greater than 10 pixels, eliminating border pixels and field 
anomalies, were sampled for: (1) the mean 7/5 ratio value, (2) standard devia- 
tion, and (3) range of values. These statistics were tabulated by 7.5' quad, 
by 30' block and by 1° block for each of the three dates (May 30, August 28, 
October 4). 
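
The per-field sampling described above can be sketched as follows. The data layout (nested lists for the two bands and a boolean field mask), the 4-neighbour definition of a border pixel, and the function names are assumptions; handling of "field anomalies" is not shown.

```python
import statistics


def interior_pixels(mask):
    """Return (row, col) of field pixels whose four neighbours are also in the field."""
    rows, cols = len(mask), len(mask[0])
    interior = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            neighbours = ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            if all(0 <= nr < rows and 0 <= nc < cols and mask[nr][nc]
                   for nr, nc in neighbours):
                interior.append((r, c))
    return interior


def field_ratio_stats(band7, band5, mask, min_pixels=10):
    """Mean, standard deviation and range of the 7/5 ratio over interior pixels.

    Returns None for fields of `min_pixels` or fewer pixels.
    """
    field_size = sum(1 for row in mask for inside in row if inside)
    if field_size <= min_pixels:
        return None
    ratios = [band7[r][c] / band5[r][c]
              for r, c in interior_pixels(mask) if band5[r][c] != 0]
    if len(ratios) < 2:
        return None
    return {"n": len(ratios),
            "mean": statistics.mean(ratios),
            "std": statistics.stdev(ratios),
            "range": (min(ratios), max(ratios))}


# Hypothetical 5 x 5 pixel field with one brighter pixel in the centre.
mask = [[True] * 5 for _ in range(5)]
band5 = [[20] * 5 for _ in range(5)]
band7 = [[40] * 5 for _ in range(5)]
band7[2][2] = 60
stats = field_ratio_stats(band7, band5, mask)
```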

These data were examined for crop separability. Although these dates were 
selected for differentiation between irrigated and non-irrigated land, some 
crop types and groups were separable. Both small grains and rice have 7/5 values 
on these dates that allow them to be identified with little confusion. However, 
pasture and orchard appear similar, as do corn and sorghum. To spectrally sep- 
arate these crops requires additional Landsat acquisitions. 

The Sutter 30' block was selected for further analysis because of the high 
proportion of agriculture and the availability of additional dates of digital 
data. Two additional dates were chosen for analysis. First, May 4 for addi- 
tional input to differentiate (1) small grains from native grasses and (2) 
pasture from orchard. Second, June 26 was chosen to (1) separate corn, with 
its earlier emergence, from sorghum and (2) identify tomatoes. The same 7.5' 
quads and sampled fields were used in this analysis. In addition to the 7/5 
ratio band, a 5/4 ratio band and a sun angle corrected Euclidean brightness 
band were created for each of the 5 dates. For each of the ratioed bands and 
the brightness band, statistics were combined over the 30' block by crop, giving the 
mean value, standard deviation, and covariance for the five dates. These stat- 
istics were used to seed the unsupervised classifier (CLUSTER) on the RSRP 
interactive system. 
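
The derived bands named above can be illustrated per pixel as below. The text does not give the formulas, so this sketch assumes that brightness is the Euclidean norm of the four MSS band values and that the sun angle correction divides by the sine of the sun elevation; both are assumptions, as are the function and parameter names.

```python
import math


def derived_bands(mss4, mss5, mss6, mss7, sun_elevation_deg):
    """Return (7/5 ratio, 5/4 ratio, sun-angle-corrected brightness) for one pixel.

    Assumes non-zero band 4 and band 5 values; a production version would
    need to guard against division by zero.
    """
    ratio_7_5 = mss7 / mss5
    ratio_5_4 = mss5 / mss4
    # Euclidean brightness over the four bands (assumed definition).
    brightness = math.sqrt(mss4 ** 2 + mss5 ** 2 + mss6 ** 2 + mss7 ** 2)
    # Normalise to an overhead sun (assumed form of the correction).
    corrected = brightness / math.sin(math.radians(sun_elevation_deg))
    return ratio_7_5, ratio_5_4, corrected


# Hypothetical pixel values and sun elevation.
r75, r54, bright = derived_bands(2, 4, 4, 8, sun_elevation_deg=30)
```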

A subset of dates was chosen to facilitate processing. On examination of 
the statistics and crop calendars, the May 4, June 26, and August 28 dates were 
chosen as giving the maximum separability between crops (Figure 6-2). Prelimi- 
nary classification was done on every fourth pixel for six iterations giving a 
maximum of 30 classes and these results were used to seed the final clustering 
and labeling. 

The combined statistics for the 30' block were used to label each of the 
30 clusters. The statistics for the 7/5, 5/4, and brightness bands for each 
crop were compared to the mean values per band for each cluster (Table 6-5a). 
The clusters were given tentative labels as either (1) one of the major crop 
types or (2) other (Table 6-5b). Clusters with the same potential label were 
grouped and mapped as one color on the RSRP interactive display system. This 
display was checked against the DWR ground data maps (Figure 6-3). Visual com- 
parison of the resulting output was encouraging, although no detailed statistical 
evaluation of the result has been done to date. 
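
One way to automate the comparison of cluster means with crop statistics is a nearest-mean rule, sketched below. The report describes the comparison but not a decision rule, so the standardized Euclidean distance, the `max_distance` cut-off for the "other" label, and all values in the example are assumptions.

```python
import math


def label_clusters(cluster_means, crop_stats, max_distance=2.0):
    """Give each cluster the label of the closest crop, or "other".

    cluster_means: {cluster_id: (band values...)}
    crop_stats:    {crop: ((mean per band...), (std dev per band...))}
    Distance is Euclidean after dividing each band difference by the
    crop's standard deviation for that band.
    """
    labels = {}
    for cluster_id, means in cluster_means.items():
        best_crop, best_dist = "other", float("inf")
        for crop, (crop_means, crop_stds) in crop_stats.items():
            dist = math.sqrt(sum(((m - cm) / cs) ** 2
                                 for m, cm, cs in zip(means, crop_means, crop_stds)))
            if dist < best_dist:
                best_crop, best_dist = crop, dist
        labels[cluster_id] = best_crop if best_dist <= max_distance else "other"
    return labels


# Hypothetical (7/5, 5/4, brightness) statistics for a single date.
crop_stats = {
    "rice": ((3.0, 0.8, 40.0), (0.5, 0.1, 5.0)),
    "grain": ((1.0, 1.2, 60.0), (0.2, 0.1, 5.0)),
}
cluster_means = {1: (2.8, 0.85, 42.0), 2: (1.1, 1.15, 58.0), 3: (5.0, 2.0, 90.0)}
labels = label_clusters(cluster_means, crop_stats)
```

With several dates, the tuples would simply be longer (one set of band values per date).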

The function of much of the crop type tasks has been to provide baseline 
information and output products for the Department of Water Resources' evaluation. 
This basic data provides the background necessary to begin defining DWR's infor- 
mation requirements in reference to the use of a Landsat-based system. Informa- 
tion needs for both inventory and mapping systems must be carefully outlined for 
further work to proceed logically and efficiently. 
Figure 6-2. 7/5 ratio, 5/4 ratio and Euclidean brightness graphed against 
the five dates studied over the Sutter 30' block. The seven major 
crops found in the Sutter area are shown. 
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Table 6- 5a. Mean values of 7/5, 5/4 and brightness for each cluster. 
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Table 6-5b. Cluster labels based on values shown above. 

CLUSTER                    CLUSTER 

 1 - grain                 16 - rice 
 2 - rice                  17 - rice 
 3 - corn                  18 - rice 
 4 - sorghum               19 - other 
 5 - tomato and other      20 - rice 
 6 - pasture               21 - other 
 7 - orchard               22 - other 
 8 - other                 23 - other 
 9 - rice                  24 - orchard 
10 - rice                  25 - rice 
11 - sorghum               26 - grain/sorghum double cropped 
12 - other                 27 - other 
13 - pasture               28 - other 
14 - orchard               29 - rice 
15 - grain                 30 - pasture 





Blue = Small grains 
Brown = Pasture 
Gray = Other 
Green = Sorghum 
Orange = Rice 
Red = Orchard 
Yellow = Corn 


Figure 6-3. Sutter 30 minute block. Classification of crop type based on 
cluster labeling. 




To guide the development of a proper inventory system, certain key questions 
need to be addressed. Some of the questions are: 

• What are the parameters for which estimates are desired? 

    • Proportion of area by crop type? 

    • Change in proportion by crop type? 

    • Water demand by crop type? 

• What are the target crops for which error will be controlled? 

• What are the error goals for the target crops? 

• Are there other crops or land use classes for which parameter 
estimates are desired? 

• What are the target populations (areas) for which information 
is required? 

• At what reporting unit are the data to be summarized? 

• At which reporting level is error to be controlled? 

• What are the constraints on the system? 

    • Cost? 

    • Timeliness? 

    • Institutional capability? 

Producing an accurate map requires responses to a different set of queries: 

• What are the crop or land use classes to be mapped? 

• What is the minimum acceptable classification accuracy by class? 

• What is the maximum acceptable field boundary error? 

• What are required characteristics of the map product? 

• What are the constraints on the system? 


Working closely with DWR, we anticipate specifying a set of inventory and mapping 
goals and constraints and proceeding with a larger scale demonstration in 1980. 


Appendix I: A Comparison of Estimate Accuracy and Costs for Segment and 
Transect Sampling - UCSB 

In preparation for this year's APT segment sampling, the cost of 
acquiring sample segments in a completely random manner using medium scale 
aerial photography appeared large and perhaps out of proportion to true 
statistical value. Although the large cost may be due to overly conservative 
estimates in terms of the number of segments that can be flown per day, it 
seems appropriate that more economical sampling schemes also be considered. 

An obvious alternative to random sampling designs which would make good use 
of photographic plane time is transect sampling along predetermined flight 
lines. The following analysis looks at this type of sampling scheme, which 
would fit well into DWR's present procedures. It is assumed that measurement 
error would remain the same regardless of sample design, so the two areas of 
concern are sample error and costs. 

Our earliest work in this area was covered in the semiannual progress 
report of June, 1979. Using a similar approach to the one cited here, it 
was found that random segment and systematic transect samples, containing 
approximately the same amount of area, yielded estimates of irrigated acreage 
that were not significantly different from one another. A two-to-one cost 
differential for photo acquisition made transect sampling appear to be 
preferable to the segment approach. Subsequent to that effort, it was brought 
to our attention that the variance in the transect sample had not been com- 
puted properly (each transect had been treated as a group of 1.6 x 8 km 
(1 x 5 mile) segments rather than a single sample). The work presented here 
is a more thorough comparison of the two sampling approaches. 


Comparison of Estimate Errors 

Using a map of active cropland in Kern County for the 1973 growing season 
(Figure I-1), a test was conducted to determine the impact of segment and tran- 
sect sampling on the estimate of cropland acreage. Data preparation involved 
the tabulation of the proportion of each square mile devoted to agriculture 
to the nearest 5 percent. The tabular data was placed into a computer 
data file so that each square mile could be accessed by its X-Y coordinate. 
This resulted in a data matrix of 62 x 79 elements representing 12,539 square 
kilometers (4898 square miles). However, because of a large amount of area 
which was not used for agricultural purposes, there were only 6902 square 
kilometers (2696 square miles) for which data was compiled. Analysis of the 
entire population of 2696 elements showed that the average amount of irrigated 
land was 53.2 percent with a standard deviation of 41.3 percent. 

The test consisted of determining the number of segment and transect sam- 
ples that must be taken before the estimate of cropland acres becomes stable. 
Stability in the estimate was determined by taking the standard deviation of 
the estimate over 100 iterations for each sample size. 
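
The stability test can be sketched as a small Monte Carlo loop. The sketch below simplifies the sample unit to a single one-square-mile cell rather than a 1 x 5 mile segment, and the population it draws from is randomly generated for illustration; neither reflects the actual Kern County data file.

```python
import random
import statistics


def estimate_spread(population, sample_size, iterations=100, seed=0):
    """Mean and standard deviation of the sample-mean estimate over
    repeated random samples drawn without replacement."""
    rng = random.Random(seed)
    estimates = [statistics.mean(rng.sample(population, sample_size))
                 for _ in range(iterations)]
    return statistics.mean(estimates), statistics.stdev(estimates)


# Hypothetical population: percent cropland per cell, to the nearest 5 percent.
_rng = random.Random(1)
population = [_rng.choice(range(0, 101, 5)) for _ in range(2696)]

# One (mean, standard deviation) pair per sample size, 20 to 70 by 5.
results = {size: estimate_spread(population, size) for size in range(20, 75, 5)}
```

The estimate would be judged stable at the sample size beyond which the standard deviation stops falling appreciably.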
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Based on earlier work documented in the ILP study, a single stratum 
(the Kern County study area was treated as such) of this size would require 
40 segment samples. This would yield coverage of approximately 512 square 
kilometers (200 square miles). To test the behavior of the estimate, com- 
putations were made for sample sizes of from 20 to 70 segments (by 5 segment 
increments). In all cases sampling was random and without replacement. Table 
I-1 shows the average estimate derived from each sample size and the standard 
deviation of the estimate. For sample sizes of 60, the variance in the 
estimate is significantly reduced and remains relatively stable or reduces 
gradually for larger sample sizes (Figure I-2). 

Random transect sampling was undertaken for a range that would result 
in approximately the same amount of area being covered as in the random seg- 
ment test. One hundred iterations of transect sample sizes ranging from 3 to 
10 were made. All transects were taken randomly and without replacement. 
The results are shown in Table I-2. 

As can be seen in Figure I-3, for any given amount of area to be covered, 
the variance of the estimate is greater for the transect sample than the 
segment sample. This indicates that a greater amount of area must be flown 
to yield a stable and dependable estimate of irrigated acreage. 

A test was also conducted to assess the impact of systematic sampling as 
opposed to total randomness. The map was divided into 8 fields of ten tran- 
sects each. Using a sample size of 8 transects, the variance in the estimate 
over 100 iterations was computed. For an areal coverage of approximately 440 
square kilometers an estimate of 58.41 percent was computed. The standard 
deviation of the estimate using systematic transect sampling was 4.2 percent, 
whereas random transect and random segment sampling yielded standard deviations 
of 6.2 percent and 4.9 percent, respectively. It appears that the distribution 
of agricultural land for this area is clumped such that a systematic type 
of approach results in a better estimate. The improved results seen here 
are analogous to improvements in sampling when stratification is used. While 
not tested here, it may be that a more systematic (or stratified) approach to 
random segment sampling would improve the performance of that procedure. 
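
The effect described above can be reproduced on synthetic data. The sketch assumes that the systematic design draws one transect at random from each of the 8 fields (the text does not spell this out), and it uses an artificially clumped set of transect values in which every field is internally uniform.

```python
import random
import statistics


def systematic_estimate(transects, n_fields, rng):
    """Mean of one randomly chosen transect from each consecutive field."""
    per_field = len(transects) // n_fields
    picks = [rng.choice(transects[i * per_field:(i + 1) * per_field])
             for i in range(n_fields)]
    return statistics.mean(picks)


def random_estimate(transects, n, rng):
    """Mean of n transects drawn at random without replacement."""
    return statistics.mean(rng.sample(transects, n))


def spread(estimator, iterations=100):
    """Standard deviation of an estimator over repeated draws."""
    return statistics.stdev([estimator() for _ in range(iterations)])


# Hypothetical clumped map: 8 fields of ten transects, each field uniform.
transects = [10 * field for field in range(8) for _ in range(10)]
rng = random.Random(0)
systematic_sd = spread(lambda: systematic_estimate(transects, 8, rng))
random_sd = spread(lambda: random_estimate(transects, 8, rng))
```

In this extreme case every systematic sample returns the population mean, so its spread is zero, while purely random samples still vary.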

Cost Comparison 

A comparison of costs was somewhat difficult because it required that we 
make assumptions about the time required to acquire the aerial photographs. 
Current estimates are that approximately 20 five-mile random sample segments 
can be flown per day. According to Fred Stumpf (DWR-San Joaquin District), 
Fresno County's 526,000 hectares (1.3 million acres) can be flown in 5 days 
using a transect approach. This is equivalent to 640 km (400 miles) per day. 
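
The daily rate can be checked with a short calculation. It assumes that each flight line covers a swath one mile wide (suggested by the 1 x 5 mile segment dimensions used elsewhere in this appendix, but not stated for the Fresno County flights) and uses the rounded 1.6 km per mile conversion that the report's figures imply.

```python
ACRES_PER_SQUARE_MILE = 640
KM_PER_MILE = 1.6          # rounded factor implied by 400 miles = 640 km

county_acres = 1.3e6       # Fresno County, from the text
flight_days = 5            # from the text
swath_width_miles = 1.0    # assumed photo swath width

square_miles = county_acres / ACRES_PER_SQUARE_MILE               # 2031.25 square miles
miles_per_day = square_miles / flight_days / swath_width_miles    # 406.25 miles
km_per_day = miles_per_day * KM_PER_MILE                          # about 650 km
```

Under these assumptions the result is roughly 400 miles (about 650 km) per day, which agrees with the rounded 640 km (400 miles) quoted above.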

Using the average area covered as a measure of total transect length 
(Table I-2) and a value of 126 km (79 miles) as the inter-transect distance (this 
is the maximum width of the study area), the cost, in flight-days, at the 
rate of 640 km per day, was computed. This is shown in Figure I-3. Also seen 
Table I-1 


Random Segment Sampling Results 

# Segments    Mean     Variance    Standard     km² 
                                   Deviation 
20            58.88    50.45       7.10         160.30 
25            58.95    39.09       6.25         200.60 
30            53.97    26.01       5.10         240.90 
35            58.44    34.50       5.87         281.62 
40            56.44    26.04       5.10         325.12 
45            57.96    24.14       4.91         374.85 
50            59.74    23.64       4.86         402.76 
55            60.44    25.45       5.05         443.56 
60            58.84    17.28       4.16         480.27 
65            59.34    14.40       3.79         523.49 
70            58.91    15.42       3.93         570.10 






Table I-2 

Random Transect Sampling Results 

# Transects   Mean     Variance    Standard     km² 
                                   Deviation 
 1            56.40    440.65      20.99         61.26 
 2            56.10    216.65      14.72        116.44 
 3            59.10    125.96      11.22        175.80 
 4            57.18     95.23       9.76        223.80 
 5            57.10     70.67       8.41        285.25 
 6            56.41     50.39       7.10        336.88 
 7            57.21     42.05       6.48        405.79 
 8            58.22     39.58       6.29        461.53 
 9            57.87     37.59       6.13        513.18 
10            57.53     39.38       6.28        568.46 
11            58.31     28.17       5.31        633.89 
12            58.21     26.03       5.10        696.50 
[Figure: requirement for sample acquisition by area covered; curves for 
random transect samples and random segment samples] 




here is the number of flight-days required to gather random segments, at the 
rate of 20 per day. As can be seen in Figure I-3, flight time requirements are 
significantly greater for segment sampling, primarily as a result of increased 
inter-segment search time. The operational simplicity of the transect proce- 
dure results in lower acquisition costs. Because DWR's photo acquisition costs 
have been approximately $135 per hour in the past, flight time can be a signi- 
ficant part of a multistage sampling program employing aircraft. 
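
The flight-day comparison can be sketched from the figures quoted in this appendix. Two points are assumptions rather than statements from the text: that area covered converts to flight-line length through a 1.6 km swath width, and that the 126 km inter-transect distance is flown once between each pair of consecutive transects.

```python
KM_PER_DAY = 640.0           # transect flying rate, from the text
INTER_TRANSECT_KM = 126.0    # inter-transect distance, from the text
SEGMENTS_PER_DAY = 20        # random segment rate, from the text
SWATH_WIDTH_KM = 1.6         # assumed width of one flight line


def transect_flight_days(area_km2, n_transects):
    """Flight-days for transects: photo distance plus ferry legs between transects."""
    photo_km = area_km2 / SWATH_WIDTH_KM
    ferry_km = INTER_TRANSECT_KM * max(n_transects - 1, 0)  # assumed: one leg per gap
    return (photo_km + ferry_km) / KM_PER_DAY


def segment_flight_days(n_segments):
    """Flight-days for random segments at the quoted daily rate."""
    return n_segments / SEGMENTS_PER_DAY


# Areas and counts taken from Tables I-1 and I-2 for comparable coverage.
transect_days = transect_flight_days(461.53, 8)   # 8 transects, 461.53 km²
segment_days = segment_flight_days(60)            # 60 segments, 480.27 km²
```

Under these assumptions eight transects need a little under two flight-days, against three flight-days for sixty segments covering a similar area.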

Figure I-4 shows the cost in flight days of both the segment and transect 
approach for given levels of variance in the estimate. In all cases the ran- 
dom transect cost is lower than that for segment sampling. Furthermore, as 
greater reductions in variance are achieved through higher sampling rates the 
cost difference between the two techniques increases. 

These results indicate that transect sampling may be a cost effective 
alternative to random segment sampling. The chief trade-off involved is the 
requirement for acquisition of greater photo coverage which increases the 
cost of the photo interpretation phase of the project. 
Appendix II: Accessibility Categories Used to Predict the Weighted Average 

Relative Cost (c^) 

Assuming that some sample units require more time to access and ground 
check, a relative measure of this different accessibility in terms of a cost 
ratio was needed for the optimal sample unit allocation process. For the 
present inventory, each sample unit was assigned one of three accessibility 
types: 

Type A: The sample unit was (1) near other sample units, (2) near a 
        good access road, and (3) had a road network going through 
        or by it. 

Type B: The sample unit lacked one or two of the Type A requirements 
        to some extent. 

Type C: The sample unit lacked all three criteria of a Type A unit 
        to some extent, or one criterion was completely lacking. 

Within a county, each polygon defined by the merged stratification 
(Section 4.2.4) was evaluated using available USGS 1:250,000 scale topographic 
maps and assigned an accessibility code. Relative costs associated with each 
accessibility type were estimated from experience in the 10-County and 14- 
County studies and current DWR costs on the following parameters: 

• The cost to photograph one sample unit using a 35 mm camera 
in a light aircraft is $32. 

• In areas defined as Access Type A, field crews could reasonably 
be expected to ground check five sample units per day. 

• In areas defined as Access Type B, four sample units per day. 

• In areas defined as Access Type C, three sample units per day. 

• The cost of maintaining a ground crew in the field is $220/day. 

Using these figures, relative costs for ground data collection were estimated 
for each accessibility type. The relative costs shown in Table II-1 were used 
to predict the average relative cost (c^) for each stratum. 


Table II-1. Cost of collecting sample unit ground data, by accessibility type. 

COST = (a / b) + cost of photography per sample unit, where 
    a = Ground crew cost per day 
    b = Number of sample units collected per day 

Accessibility Type    COST                     Relative Cost, c^ (COST/76) 
A                     $220/5 + $32 = $76       1.00 
B                     $220/4 + $32 = $87       1.14 
C                     $220/3 + $32 = $105      1.38 
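
The arithmetic in Table II-1 can be reproduced as below. Rounding the unit cost to whole dollars before forming the ratio is what yields 1.38 for Type C. The last function is a guess at how a stratum's average relative cost might be formed: weighting by the number of sample units of each type is an assumption, since the text does not state the weights.

```python
CREW_COST_PER_DAY = 220      # dollars, from the text
PHOTO_COST_PER_UNIT = 32     # dollars, from the text
UNITS_PER_DAY = {"A": 5, "B": 4, "C": 3}


def unit_cost(access_type):
    """Cost in whole dollars of collecting one sample unit of the given type."""
    return round(CREW_COST_PER_DAY / UNITS_PER_DAY[access_type] + PHOTO_COST_PER_UNIT)


def relative_cost(access_type):
    """Cost relative to a Type A unit, to two decimals."""
    return round(unit_cost(access_type) / unit_cost("A"), 2)


def weighted_relative_cost(unit_counts):
    """Average relative cost for a stratum, weighted by unit counts (assumed weights)."""
    total = sum(unit_counts.values())
    return sum(relative_cost(t) * n for t, n in unit_counts.items()) / total


costs = {t: unit_cost(t) for t in UNITS_PER_DAY}
relative = {t: relative_cost(t) for t in UNITS_PER_DAY}
# Hypothetical stratum with 10 Type A, 5 Type B and 5 Type C units.
stratum_cost = weighted_relative_cost({"A": 10, "B": 5, "C": 5})
```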


As these values for photo and ground data acquisition are general ap- 
proximations, more refined estimates will be realized at the conclusion of 
the 1979 inventory for future operational sample allocation efforts. The 
DWR ground survey crews recorded time spent ground checking the sample units, 
allowing these more refined estimates for future surveys. 
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1979 
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Proceedings Third Conference on the Economics of Remote Sensing 
(November 1979). 

Tinney, L., J. Holloway, J. Baggett and J. Estes. "A Multistage Mapping 
Approach to Inventorying Irrigated Cropland Using Landsat and 
Aircraft Imagery." Proceedings 5th Pecora Symposium (July 1979). 

Tinney, L., S. Wall, R. Colwell and J. Estes. "Irrigated Lands Assessment 
for Water Management - Applications Pilot Test." Proceedings 5th 
Pecora Symposium (July 1979). 

Wall, S.L. "California's Irrigated Lands: Landsat-Based Estimation and 
Mapping." Proceedings Symposium on Identifying Irrigated Lands Using 
Remote Sensing Technologies (November 1979). Missouri River Basin 
Commission, Omaha, Nebraska. 

Wall, S.L. and J. Baggett. NASA Grant NSG-2207 Quarterly Progress Reports 
for the periods: 

1 December 1978 - 30 March 1979 
1 April 1979 - 30 June 1979 
1 July 1979 - 30 September 1979 

Space Sciences Laboratory, University of California, Berkeley. 

Wall, S.L., R.W. Thomas and L.R. Tinney. "Landsat-Based Multiphase 
Estimation of California's Irrigated Lands." Joint Proceedings 
of the ASP-ACSM 1979 Fall Technical Meeting (September 1979), 221-236. 

1978 

Wall, S.L., L. Tinney, J. Baggett, C.E. Brown, K.J. Dummer, T.W. Gossard, 
J. Holloway, T. Torburn and R.W. Thomas. "Irrigated Lands Assessment 
for Water Management - Applications Pilot Test (APT)." Annual Progress 
Report: 1 November 1977 - 31 December 1978. Series 20, Issue 7. Space 
Sciences Laboratory, University of California, Berkeley. 

Wall, S.L., L. Tinney, J. Holloway and T. Torburn. "Irrigated Lands Assess- 
ment for Water Management - Applications Systems Verification and 
Transfer (ASVT)." Semi-Annual Progress Report: 1 November 1977 - 30 
April 1978. Series 19, Issue 59. Space Sciences Laboratory, University 
of California, Berkeley. 

Wall, S.L., L. Tinney, C.E. Brown, S.J. Daus, C.E. Ezra, T. Torburn and 
V.L. Vesteroy. "Determining the Usefulness of Remote Sensing for 
Estimating Agricultural Water Demand in California." Annual Report 
1 January 1977 - 28 February 1978. Space Sciences Laboratory, University 
of California, Berkeley. 
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